
jazzle-parser's Introduction

jazzle (a.k.a jsRube)

Join the chat at https://gitter.im/JazzleWare/jazzle-parser

A small, simple, and ridiculously fast parser for all versions of ECMAScript/JavaScript, written in plain ECMAScript3, on which I have been working on and off since September 2015, under the codename 'lube'.

A bug in v8 (and consequently in node) made it very difficult to run on node versions 5 and below. The bug has been resolved, and it now runs smoothly (and faster than any other parser I know of) in node v6.2.0+. Please bear this notice in mind while trying to use this parser.

#Features

It always records the location data, range data, and raw value of every node, and still it parses jQuery-1.4.2 2x faster than esprima 2.7.2 when the latter records no locations/ranges, and 3.5x faster when it does. Funnily enough, it does all the above while keeping track of as many early errors as I could find in the spec.

It is almost completely esprima-compatible (except when things get annoying, in which case it is acorn-compatible).

#Future

  • cleaner source
  • tolerant parsing
  • even lighter weight
  • descriptive errors
  • more comments
  • finer grained control over parsing (via more options, possibly)
  • a demo website
  • standalone regex verifier (currently, regex verification is accomplished by means of the underlying engine's RegExp constructor, which, while not a defendable approach, is the most straightforward; needless to mention, it's also currently the sole approach)
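
The engine-backed regex check mentioned in the last item can be pictured roughly as follows. This is a minimal sketch and not jazzle's source: `verifyRegex` is a hypothetical name, and the only real API it relies on is the built-in `RegExp` constructor, which throws a `SyntaxError` for an invalid pattern or invalid flags.

```javascript
// Hypothetical helper: validate a regex literal's pattern and flags by
// delegating to the host engine's RegExp constructor.
function verifyRegex(pattern, flags) {
  try {
    new RegExp(pattern, flags); // throws on a bad pattern or bad flags
    return { valid: true, message: null };
  } catch (e) {
    return { valid: false, message: e.message };
  }
}

console.log(verifyRegex('a+b', 'g').valid); // true
console.log(verifyRegex('(', '').valid);    // false: unterminated group
console.log(verifyRegex('a', 'gg').valid);  // false: duplicate flag
```

The drawback of this approach is visible in the sketch: the verdict depends on which regex features the host engine happens to support, which is what a standalone verifier would avoid.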

#Using in the browser

Include the file ./dist/jazzle.js in a <script> tag. It exposes the Parser constructor and the parse utility function. One use case could be:

var code = 'sample(code);';
var result;

result = new Parser(code, false).parseProgram();

// or alternatively
result = parse(code, false);

NOTE: in ES versions before ES2015, any given source was treated as a 'script'; in ES2015 and above, this is no longer the case -- sources can be parsed either as scripts or as modules. You have to explicitly tell the parser if you want it to parse your code as a 'module' rather than a 'script' by passing the value true as the second argument to the Parser constructor or the parse function:

var code = 'import * as a from "l"';
// please note the `true` there; it tells the parser to treat the code as module code,
// because `import`s are module-specific source elements.
var result = new Parser(code, /*-->*/true/*<--*/).parseProgram(); 
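
The same script/module split exists in JavaScript engines themselves. As an illustration that uses Node's built-in `vm` module rather than jazzle (the helper name `compilesAsScript` is hypothetical), an `import` declaration is rejected when the source is compiled as a script:

```javascript
var vm = require('vm');

// Hypothetical helper: returns true when `code` compiles under the
// script goal. vm.Script only compiles the source; nothing is executed.
function compilesAsScript(code) {
  try {
    new vm.Script(code);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(compilesAsScript('sample(code);'));          // true
console.log(compilesAsScript('import * as a from "l"')); // false
```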

#Building

In the jazzle repository's root, run the build script, i.e., ./builder/run.js:

node ./builder/run.js

It bundles the sources under the 'src' directory into a single file, to be found under dist/jazzle.js. It also runs a self-test after bundling is complete; the parser should only be used if the test stage passes without any errors.

#Quick Testing

Even though a thorough test is performed during the build process (that is, while building via ./builder/run.js), quick tests occasionally come in handy. To run quick tests, do:

node ./test/run.js

#Benchmarking

Before running a benchmark, make sure you have the 'esprima', 'acorn', and 'benchmark' packages installed; if that is not the case, install them this way:

npm install esprima@latest
npm install acorn@latest
npm install benchmark@latest

Then run the actual benchmarking facility this way:

node ./bench/run.js

This feeds the corpus located under sources into each parser, asks each one to parse every file while recording node location data, collects the timings for each parser, and prints the results.
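
The general shape of such a run can be sketched with nothing but Node built-ins. This is not the contents of ./bench/run.js and it does not use the 'benchmark' package; `timeParsers` and the two stand-in "parsers" are hypothetical, chosen only so the sketch runs without anything installed.

```javascript
// Hypothetical sketch: time each parser over a corpus and report the
// average time per pass. `parsers` maps a name to a function that takes
// source text.
function timeParsers(parsers, sources, runs) {
  var results = {};
  Object.keys(parsers).forEach(function (name) {
    var start = process.hrtime();
    for (var r = 0; r < runs; r++) {
      sources.forEach(function (src) { parsers[name](src); });
    }
    var diff = process.hrtime(start); // [seconds, nanoseconds]
    results[name] = (diff[0] * 1e3 + diff[1] / 1e6) / runs; // ms per pass
  });
  return results;
}

// Stand-in "parsers"; a real run would wrap the actual parse calls.
var results = timeParsers({
  split: function (src) { return src.split(/\s+/); },
  json: function (src) { return JSON.stringify(src); }
}, ['var a = 1;', 'function f() { return 2; }'], 100);

Object.keys(results).forEach(function (name) {
  console.log(name + ': ' + results[name].toFixed(3) + ' ms per pass');
});
```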

#Using jazzle via npm

First,

npm install jazzle

Then:

var jazzle = require( 'jazzle' );
console.log( jazzle.parse('var v = "hi !";') );

jazzle-parser's People

Contributors

gitter-badger, icefapper, zenekron, zetlen


jazzle-parser's Issues

send as few arguments to `err` as possible

this.err currently receives more info via errParams than the baseline error reporter needs.
this was for the upcoming tolerant mode, but tolerant subsystem is going to get implemented in a totally new way so sending those extra params will no longer be necessary.

is there a more straightforward (and possibly more lightweight) approach for tracking the so-called "tricky" nodes?

Hi
Before we begin, let's go over a few definitions.

A "potpat" is a node that has the potential of becoming an assignment- and/or a binding-pattern -- these nodes are MemberExpression, AssignmentExpression, ObjectExpression, Property (almost), Identifier, and ArrayExpression; please note, though, that while [a] is a potpat, -[a] is not, because it can't appear at the left of an assignment.

A "parpos" node is a node that can be an arrow parameter. In ([a, { b = [[e], l] = 12}]), a and b are parpos nodes; in -([a]), there is no parpos, because the - in front of the ( makes it impossible for the paren to serve as the parameter list of an arrow function.

A "tricky" situation (as I have no other name for it) is, broadly, a situation in which the node is probably an error, but its erroneousness can not be ascertained.

consider the following cases:

"use strict";
[ eval, // not yet an error, but as a part of a "potpat", it might well turn into an error.
  arguments = 12 // an error, but may not be the first one, so it will not immediately throw
]
; // <-- now we are sure about `arguments = 12` being an actual error

[ eval,
  arguments = 12
]
= // <-- now we are sure about `eval` being an actual error; `arguments = 12` is no longer the first error
12;

[{ a= b }, // possible error #1
 [ arguments ] = 12, // possible error #2
]
= // raise #2
12;

(
   12, // surprisingly, it is not considered tricky, because it is not potpat
   [(a)], // possible error #1 -- if it is a parpos, `(a)` will be an invalid parameter
   e * 12, // this is not tricky either
   {a=b} // possible error #2 -- if it isn't a parpos, it will be an unsatisfied assignment
)
; // raise #2

(
    [(a)], // possible error #1
   12,
    e * 12, 
    {a=b} // possible error #2
) => /* <-- raise #1 */ 12
// please note that, in the case above, if `12` had come first, it'd have been the error finally raised --
// "possible" errors, like their name suggests, are raised only if no error has happened before them.

function* l() {
   "use strict";
   (a=yield, // possible error #1-- it is a parpos node, and if it turns out to be an actual param, 
                 // it is not allowed to contain a yield expression
    arguments=12 // possible error #2
    )
    ; // raise #2

    (a=yield, // possible error #1
     arguments = 12 // possible error #2
    ) => /* <-- raise #1 */ 'l';
}  

There is also an extreme case; considering we are in a generator, what should the error for the following code be?

(yield)=>12

Rather simple -- it should be "yield can not be an arrow parameter when in a generator".

But what about this one (again considering we are inside a generator)?

(yield = 12) => 12

This one should raise the same error as above; but this means we should actually postpone an outright error -- a syntactic one (rather than a semantic one, which would've been easier to deal with) -- until the ) is reached.
But of course things are not that hard, and the "syntactical" error we are worrying about is only contextually considered a syntax error, since in a non-generator, non-strict context, yield = 12 is indeed allowed.

The case is dismissed though -- yield is an actual keyword inside a generator, so (yield)=>12 and (yield=12)=>12 will be just as erroneous as (while)=>12 and (while=12)=>12.

That makes for the foreword.

Jazzle is currently using multiple variables (firstYS, parenYS, firstElemWithYS, firstParen, firstUnassignable, firstNonTailRest, firstEA, firstEAContainer 💦) to track all the issues above, and while it does a decent job of tracking all these tricky cases, it still seems to be doing so in a more complicated fashion than it should.
I believe a more straightforward (and more lightweight) approach has got to be found.

As an analogy, there are two ways of counting sheep.

One is to count the hooves, and divide the result by 4; this approach needs top-notch counting skills.
The other is to count the heads; even I can do it.

But it looks like jazzle, in its current state, is counting the hooves.
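
To make the bookkeeping discussed above concrete, here is one possible way to picture it, offered only as a sketch and not as jazzle's code or a settled answer to the question. It keeps a single "first possible error" slot for each way an ambiguous node can end up being read. `PendingErrors`, its method names, the outcome labels, and the messages are all hypothetical.

```javascript
// Hypothetical sketch: one "first possible error" slot for each way an
// ambiguous node can end up being read.
function PendingErrors() {
  this.first = { expression: null, pattern: null, params: null };
}

// Record a possible error that becomes real only if the enclosing node
// turns out to be an `outcome`. The earliest source offset wins.
PendingErrors.prototype.note = function (outcome, offset, message) {
  var current = this.first[outcome];
  if (current === null || offset < current.offset) {
    this.first[outcome] = { offset: offset, message: message };
  }
};

// Called once the deciding token is seen: `;` resolves to 'expression',
// `=` to 'pattern', `=>` to 'params'. Returns the error to raise, or null.
PendingErrors.prototype.resolve = function (outcome) {
  return this.first[outcome];
};

// Strict-mode source: [ eval, arguments = 12 ]
var pending = new PendingErrors();
pending.note('pattern', 2, 'eval can not be an assignment target');
pending.note('expression', 8, 'arguments can not be assigned to');
pending.note('pattern', 8, 'arguments can not be an assignment target');

console.log(pending.resolve('expression').message); // the `;` case
console.log(pending.resolve('pattern').message);    // the `=` case
```

The sketch deliberately ignores nesting: when an inner bracket or paren is resolved, whatever possibilities remain would still have to be merged into the enclosing node's slots.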

bugs, missing features and spec violation

@icefapper

  1. Import and export are broken; early errors are not handled. E.g.:

    var a, b; export default a; export { b as default };
    export { a, b as c }
    export default 1; export default 2;
    export { a as default }
    export { a }

All of these fail. See also #12.

  2. Async parses for ES6
  3. Exponent parses for ES6
  4. ESTree violation. The author calls it "extra information".

And does not handle

({a({e: a.b}){}})
(function* ({e: a.b}) {})
(function ({e: a.b}) {})

And most of the todos in the code haven't been fixed for months

Build system introduces redundant boilerplate

I noticed that the build system has created a lot of boilerplate and indirection in the bundle. It's effectively a concatenation script, but with some special semantics that are plainly very unidiomatic, and introduce numerous otherwise-unnecessary closures at load time.

To give an example, here's a snippet from your src/[email protected]

this.err = function(errorType, errParams) {
  errParams = this.normalize(errParams);
  return this.errorListener.onErr(errorType, errParams);
};

You could just as easily do this instead:

Parser.prototype.err = function(errorType, errParams) {
  errParams = this.normalize(errParams);
  return this.errorListener.onErr(errorType, errParams);
};

Doing this would let you just make your build script a glorified cat program, that just happens to wrap everything in an IIFE.

This would be much simpler to write and maintain, even if you decided to keep the naming convention. In fact, UglifyJS2 already does something similar.
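
For illustration, a "glorified cat" of that kind could look roughly like this. It is a sketch under assumptions, not the project's builder: `bundle` is a hypothetical function, the file names are throwaway examples created in a temporary directory, and only Node's `fs`, `os`, and `path` modules are used.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical "glorified cat": concatenate the given files in order and
// wrap the result in a single IIFE.
function bundle(files, outFile) {
  var body = files.map(function (file) {
    return '// ' + path.basename(file) + '\n' + fs.readFileSync(file, 'utf8');
  }).join('\n');
  fs.writeFileSync(outFile, ';(function () {\n' + body + '\n})();\n');
}

// Demo with two throwaway files in a temporary directory.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'bundle-'));
var a = path.join(dir, 'constructor.js');
var b = path.join(dir, 'err.js');
fs.writeFileSync(a, 'function Parser() {}\n');
fs.writeFileSync(b, 'Parser.prototype.err = function () { return "err"; };\n');

var out = path.join(dir, 'bundle.js');
bundle([a, b], out);
console.log(fs.readFileSync(out, 'utf8'));
```

A real bundle would also need a line inside the wrapper that exports `Parser`; that part is left out here.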

accept options in relevant locations

that is, the exported parse function and the Parser constructor; the options ought to be esprima/acorn compatible, making jazzle a transparent drop-in.
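
As a rough picture of what such a drop-in layer might involve, the sketch below folds esprima-style names (`loc`, `range`) and acorn-style names (`locations`, `ranges`) into one shape, and keeps accepting the boolean "is module" argument from the README. `normalizeOptions` and the returned field names are hypothetical; nothing here is an existing jazzle API.

```javascript
// Hypothetical sketch: accept esprima-style (`loc`, `range`) and
// acorn-style (`locations`, `ranges`) option names, plus the boolean
// "is module" second argument.
function normalizeOptions(options) {
  if (typeof options === 'boolean') {
    options = { sourceType: options ? 'module' : 'script' };
  }
  options = options || {};
  return {
    sourceType: options.sourceType === 'module' ? 'module' : 'script',
    loc: Boolean(options.loc || options.locations),
    range: Boolean(options.range || options.ranges)
  };
}

console.log(normalizeOptions(true));
console.log(normalizeOptions({ locations: true, sourceType: 'module' }));
console.log(normalizeOptions());
```

Defaulting `loc` and `range` to false follows the esprima and acorn defaults, where location recording is opt-in.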

location issues

@icefapper Hi! I tried this module. Awesome work!

But is there any way I can turn off this misleading location stuff? In Acorn and Esprima that is off by default, and can be activated through options.

And I also noticed that the location is not compatible with either Acorn or Esprima. Is this something that will be fixed?

Missing features

@icefapper

  • object spread (Acorn has had this for a year at least)
  • dynamic import
  • new template features
  • JSX (a must-have)

About 0.6-dev

@icefapper I noticed the effort you put into splitting the codebase into smaller components but unfortunately, in my opinion, it's not completely on the mark yet.

For now you should only think about node's environment and focus on making the best use of its module system. A practical example of this would be what I've done in the parser folder where:

  1. Prototype functions are defined one-per-file in parser/parse and parser/util.
  2. Shorthands that are meant to be used by only a function (like #asArrowFuncArgList and #asArrowFuncArg being called only by Parser#parseArrow) are now bound to that function module's scope.
  3. The constructor is a function defined in parser/constructor.js.
  4. The parser's entry point is parser/index.js which automatically binds the prototype to its constructor.
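
The binding step in point 4 can be sketched in a single file as below. This is only a stand-in for the layout described above: in the real layout each function would live in its own module and be pulled in with `require`, and the `methods` map and the body of `parseProgram` are hypothetical placeholders.

```javascript
// Hypothetical single-file stand-in for the layout described above.
function Parser(src, isModule) {        // would be parser/constructor.js
  this.src = src;
  this.isModule = Boolean(isModule);
}

var methods = {                         // would be parser/parse/*.js etc.
  parseProgram: function () {
    return { type: 'Program', sourceType: this.isModule ? 'module' : 'script' };
  }
};

// Would be parser/index.js: bind every method to the constructor's prototype.
Object.keys(methods).forEach(function (name) {
  Parser.prototype[name] = methods[name];
});

console.log(new Parser('a', true).parseProgram().sourceType); // module
```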

Pros & Cons

Pros

  • Hierarchical organization of modules
  • No global space pollution (like with the _class)
  • Better scope management thanks to modules (you expose only what you need)
  • Natively supported by node (no need for compilation/transpilation of any sort)

Cons

  • The browser version has to be generated via browserify

Additional notes

While the use of browserify might look a bit restrictive, it is actually very advantageous. Why? It can generate UMD builds! This way jsRube would work as intended independently of how it is loaded in the browser (be it a src tag, an AMD module loader, a CommonJS module loader, or whatever).

By the way, if you want to discuss privately, I'm always available at [email protected].

`new` submodule requires a serious scrutiny

almost all other submodules have been rewritten. this one requires something along those lines too -- things like CONTEXT_UNASSIGNABLE_CONTAINER are still in there even though they've been swept out elsewhere for quite some time actually.

Tests are going to fail without being adjusted first

Hello
Just wanted to say that jsRube's ASTs are slightly different from those of esprima; for example, while jsRube keeps nodes' start and end locations ('loc') in 'start' and 'end', respectively, esprima keeps them in 'ranges'. The code at the beginning of the function 'compare' in module './util.js' is actually the 'adjuster' code. Thanks a lot for reading this far.
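
An adjuster of this sort might look something like the sketch below. It is not the code from './util.js'. It assumes, purely for illustration, that a node carries numeric `start` and `end` offsets and that the comparison target expects a `range: [start, end]` pair; `toRange` is a hypothetical name.

```javascript
// Hypothetical adjuster: rewrite numeric `start`/`end` fields into a
// `range: [start, end]` pair on every node of a tree. Objects whose
// `start`/`end` are not numbers (such as a line/column `loc`) are kept.
function toRange(node) {
  if (Array.isArray(node)) return node.map(toRange);
  if (node === null || typeof node !== 'object') return node;
  var hasOffsets = typeof node.start === 'number' && typeof node.end === 'number';
  var out = {};
  Object.keys(node).forEach(function (key) {
    if (hasOffsets && (key === 'start' || key === 'end')) return;
    out[key] = toRange(node[key]);
  });
  if (hasOffsets) out.range = [node.start, node.end];
  return out;
}

var adjusted = toRange({
  type: 'Identifier', name: 'a', start: 0, end: 1,
  loc: { start: { line: 1, column: 0 }, end: { line: 1, column: 1 } }
});
console.log(JSON.stringify(adjusted));
```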
