Code Monkey home page Code Monkey logo

Comments (19)

getify avatar getify commented on July 19, 2024

Hmmmmm...

Despite various theories on html comments in JS, I think these aren't actually comments according to JS.

I think they are pre-stripped by the HTML parser. According to the ES5.1 grammar, // and /* */ are the only validly recognized comments. So I doubt very much that the JS engine is even seeing them by the time it lexes the code.

I decided to ask on twitter to verify. We'll see what comes up.

Also, tried esprima and tried traceur, and neither of them agreed that HTML comments were in any way valid JS. It's clear they both expect for HTML comments to have been processed before they get the code.

from literalizer.

getify avatar getify commented on July 19, 2024

This doesn't necessarily mean that literalizer shouldn't identify these special (unexpected) comments. Very interesting question. If you use literalizer to process arbitrary (legacy) JS code, code which might be destined for inclusion in an inline <script> tag in an HTML page, you may very well have those HTML comments strewn throughout them (regardless of if that's a good idea).

Right now, literalizer just ignores and skips over them. Well, not entirely. It treats those special characters like actual JS operators, which affects the heuristics of where blocks are allowed, etc. So it's quite possible that an HTML comment could throw off literalizer's identification of regex literals that immediately follow blocks. It's also quite possible that an HTML comment in a certain location could throw off other heuristics.

One might argue that literalizer should recognize them and call them out as specific element types. One might argue that literalizer should recognize them (that is, not let them muck with heuristics), but should just skip over them as regular JS text.

Or, one might just do as esprima and the other parsers seem to do, and punt on them (expect they are not there, and you're at your own risk if you do put them in).

I am quite undecided at this point. There's very valid arguments on all 3 sides.

from literalizer.

getify avatar getify commented on July 19, 2024

Well... here we go, @mathiasbynens responded. It is a non-standard but de-facto part of all JS engines in browsers.

That doesn't change anything about my indecision as to whether literalizer should align its behavior more with this non-standard stuff in browsers, or more like all the other JS-based parsers/tools out there, which seem to universally NOT handle these types of comments.

from literalizer.

getify avatar getify commented on July 19, 2024

Uh oh... @rwaldron drops the mic.

var a = 1, b = 1; a <!--b;

This is bad, because if JS tooling doesn't recognize the <!-- as a single line comment, but the browser engine does, then the tooling will come up with different results for the code. That sucks. :(

from literalizer.

mathiasbynens avatar mathiasbynens commented on July 19, 2024

That doesn't change anything about my indecision as to whether literalizer should align its behavior more with this non-standard stuff in browsers, or more like all the other JS-based parsers/tools out there, which seem to universally NOT handle these types of comments.

Maybe this helps: what is your definition of “non-standard”? Not part of ECMAScript? There is a standard that defines these things, so technically it’s not non-standard.

Ideally, JavaScript parsers and other tools would implement JavaScript, as in, Web ECMAScript, instead of just ECMAScript. (Well, ideally, there wouldn’t be a separate JavaScript / Web ECMAScript spec, and everything (incl. compatibility requirements) would be defined by the ECMAScript spec itself. We’re slowly getting there.)

from literalizer.

getify avatar getify commented on July 19, 2024

@mathiasbynens but does it make sense for only one JS tool to implement to a different standard than the other JS tools? How does that help tool interop? Is(n't) tooling interop and predictability every bit as important as predictability of code interpretation at different layers of the stack?

I'm not saying I have an answer yet. I'm quite torn.

from literalizer.

mathiasbynens avatar mathiasbynens commented on July 19, 2024

Is(n't) tooling interop and predictability every bit as important as predictability of code interpretation at different levels?

No.

When tools have bugs (and not accounting for real-world usage is definitely a bug), they can and should be fixed. Behavior in JavaScript engines and in browsers that is there because of compatibility requirements is unlikely to be changed/removed.

So, the only way to improve the situation is to fix the broken tools.

from literalizer.

getify avatar getify commented on July 19, 2024

@mathiasbynens except that it's unclear to me if literalizer was the only tool to start recognizing HTML comments that this would be seen as a benefit of literalizer which the other tools might aspire to live up to, or if it would be seen as detrimental (from the tooling perspective) to deviate from all the other tools in some "strange" way?

literalizer wants "market share" just like browsers do. It needs to behave in a way that makes it a peer to other JS tools, or better, but certainly not seen as "non-standard" or, worse, "sub-standard". ;-)

All this is to say that the decision about how to handle this is more complicated than it seemed at first glance.

from literalizer.

ljharb avatar ljharb commented on July 19, 2024

Sorry for not providing more context initially. Worth nothing that node (because it uses v8) supports this syntax too.

Since one of the main purposes of literalizer is to preparse difficult literals so that other tools can parse the rest - whether for enforcing style, syntax, or anything else - I think it's a good idea for any possibly valid syntax to be supported.

It could definitely be a new segment type - it may not be a "comment" - but it's something.

from literalizer.

getify avatar getify commented on July 19, 2024

confirmed that node does in fact allow these. surprises me. but proves this is a discussion worth having.

but the question of "valid syntax" is a relative term, IMO. "valid" according to who? before this morning, I would have said that ES is the only who that matters. now I'm not so sure.

and as I said above, it still matters to me how other JS tools are going to address this question. I'm going to file bugs with them and see how those conversations shake out.

from literalizer.

ljharb avatar ljharb commented on July 19, 2024

All sorts of JS libraries account for the fact that certain older browsers return 'function' for the typeof a regex, despite that not being part of the spec. I'd say shooting for ES, but deviating where engines do (as long as it doesn't prevent valid ES syntax from working) is how a tool should work.

I look forward to hearing what other tools decide!

from literalizer.

mathiasbynens avatar mathiasbynens commented on July 19, 2024

from literalizer.

getify avatar getify commented on July 19, 2024

Further info on tooling behavior:

  • jslint rejects it as invalid syntax
  • uglifyjs rejects it as invalid syntax
  • jshint actually breaks completely, and just errors out. ugly.

from literalizer.

getify avatar getify commented on July 19, 2024

I've also posted to the JS-tools list to see what others there think.

from literalizer.

getify avatar getify commented on July 19, 2024

Here's a discussion thread from awhile back on why uglifyjs considered and ultimately rejected them: mishoo/UglifyJS#346

from literalizer.

getify avatar getify commented on July 19, 2024

It's apparently slated to standardize in ES6: http://people.mozilla.org/~jorendorff/es6-draft.html#sec-B.1.3

Given that literalizer is already going to great lengths (other filed issues) to support all the complexities that ES6 foists upon us, if they are in fact going to standardize HTML-style comments, we probably should as well. I don't want to. But we probably should.

from literalizer.

ljharb avatar ljharb commented on July 19, 2024

I agree with you on all counts.

from literalizer.

getify avatar getify commented on July 19, 2024

uglifyjs landed a patch this morning, as did Acorn. esprima has accepted it and will patch soon. JSHint is probably going to have to patch for it (since they have a buggy crash/fail).

The cards are falling, and so it's clear now that literalizer will need to do so as well.

However, I probably will have this a configurable option as suggested in #14. Given what we know, the safer thing is probably to have it defaulted on, but then let someone turn that off if they aren't targeting such an engine and/or if they don't want to allow such things through.

If you turn off literalizer tokenizing these, and it encounters them, it will treat them as constituent operators, which might in fact throw off the heuristics of other identifications, but that will just be a risk you explicitly take when you turn off this config option.

from literalizer.

getify avatar getify commented on July 19, 2024

landed this change just now. It's on by default, but can be turned off by setting this option to true: overlook_html_comment_markers

from literalizer.

Related Issues (17)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.