Code Monkey home page Code Monkey logo

Comments (20)

sideshowbarker avatar sideshowbarker commented on June 10, 2024

This seems like a nice potential savings.

Agreed

@sideshowbarker are you familiar enough with the various things Wattsi can output to write a regex to detect such line numbers? I recall them being in parenthesis; maybe just (\d+)?

We could always do it that way or I think it’s also possible that we could have wattsi exit with a particular code (I mean other than just 1) if it has reported any any line numbers as part of the error message. Then of course we just make the build script check for that exit code.

If we can actually make wattsi do that, shall we handle it that way?

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

A very quick look back at the wattsi sources indicates that maybe the only time wattsi ever actually reports line numbers is for parse errors. If that’s the case it should be very easy to just make it exit with a different code if it has found any parse errors.

from html-build.

domenic avatar domenic commented on June 10, 2024

That makes a lot of sense to me, yeah.

from html-build.

domenic avatar domenic commented on June 10, 2024

Although in theory we could consider adding line numbers to other errors in the future, instead of talking about the nearest heading... that's a separate topic though.

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

Although in theory we could consider adding line numbers to other errors in the future, instead of talking about the nearest heading... that's a separate topic though.

Yeah, that would definitely make things much easier for contributors. So I guess we should open another wattsi issue for that. In the case of the parse errors, looking at the source I see they’re handled as actual exceptions, inheriting from an Exception class, and I think by doing it that way, we kind of get the line and column numbers for free.

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

In the case of the parse errors, looking at the source I see they’re handled as actual exceptions, inheriting from an Exception class, and I think by doing it that way, we kind of get the line and column numbers for free.

So the part of the wattsi source that actually adds the line and column numbers to the message is at https://github.com/whatwg/wattsi/blob/master/src/html/htmlparser.pas#L1195 so while they’re not just coming from some kind of generic input exception, it seems they are exposed by FInputStream.Line and FInputStream.Column from whatever stream wattsi is reading from.

Given all that and thinking about what the non-parsing parts of the code are doing, I realize the challenge is that the other parts of the code are probably not working from a stream at all but instead just on the tree the parser builds/exposes. So at those points in the code I think there are no line/column numbers readily available. Which is why Hixie didn’t have line/column numbers in those messages to begin with, and why it will take some significant work to implement any kind of line/column reporting for them.

Anyway, it’s definitely worth having an issue open to track.

from html-build.

domenic avatar domenic commented on June 10, 2024

I guess the trick would be to associate a locationinfo (line/column) with each element at parse time, that can later be used by Describe().

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

I guess the trick would be to associate a locationinfo (line/column) with each element at parse time, that can later be used by Describe().

Yeah, exactly. That’s what I had initially thought the existing code might already have. But given that wattsi doesn’t seem to be hooking into (or built on top of) any generic parsing API (like SAX or whatever) that might provide a locationinfo thingy, it seems like some custom locationinfo provider would need to be written into the parser (and then exposed as a member of element objects in the tree).

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

So whatwg/wattsi#8 will cause the exit code for wattsi (local) to be set to 65 in the case when there are line numbers in the message output—which as discussed here earlier only occurs with the current wattsi code when wattsi has emitted one or more parse errors.

So I can easily update the build script to check for that special 65 exit code in the wattsi local case—but if catching that special exit code is how we want to handle it in the local case, then we need some similar way to catch in the wattsi-server case.

So, a proposal: Change wattsi-server behavior to be that it returns the HTTP code 400 (“malformed syntax”) only in the (parse-error-with-line-numbers reported) case where the its local wattsi has exited with 65 exit code, and otherwise returns the HTTP code 422 (“syntactically correct but semantically erroneous”) if other (non-parser) errors were found (i.e., errors that wattsi finds while examining the document tree and for which is currently does not report line numbers).

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

Or to more closely reflect the actual wattsi exit code, maybe instead return a 465 HTTP code for the wattsi exit-code 65 case, and just continue returning a 400 for any other case where wattsi has found an error.

from html-build.

domenic avatar domenic commented on June 10, 2024

Hmm. I'm down with the 400/422 plan. But what do you think of outputting the wattsi exit code directly for consumption? Either as a wattsi-exit-code.txt file inside the zip, or as a header in the response? I am not sure which would be better.

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

But what do you think of outputting the wattsi exit code directly for consumption?

Yeah I think that would be better all around than trying to shoehorn it into HTTP semantics.

Either as a wattsi-exit-code.txt file inside the zip, or as a header in the response? I am not sure which would be better.

I think sending it as a header in the response would be cleaner and less error-prone than writing it out as a file and packaging it up with the output. But I don’t feel strongly about it.

from html-build.

domenic avatar domenic commented on June 10, 2024

The build server now delivers a Wattsi-Exit-Code header. It also accepts a ?quiet query string, and I recommend we switch the URL to add /wattsi so that we can stop serving wattsi for all URLs to allow e.g. whatwg/build.whatwg.org#4 to work at a different build URL.

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

The build server now delivers a Wattsi-Exit-Code header. It also accepts a ?quiet query string,

Excellent. So with that I think we can get a PR landed for this shortly.

recommend we switch the URL to add /wattsi so that we can stop serving wattsi for all URLs to allow e.g. whatwg/build.whatwg.org#4 to work at a different build URL.

SGTM. So you wanna go ahead and just add the /wattsi URL now? And then I can update the build script (and after that’s landed you could disable the old URL).

from html-build.

domenic avatar domenic commented on June 10, 2024

Right now the build server actually accepts any incoming HTTP request and does the same thing :). So /wattsi already works. What I meant is that in the course of making your changes if you could change it to /wattsi, it will still work, and then once that's merged I can update the server to only accept /wattsi.

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

ah, OK, gotcha

from html-build.

foolip avatar foolip commented on June 10, 2024

Oh, this is what I was trying to fix myself today, and asked @sideshowbarker about on IRC. It looks like the main trouble was making it work both for local and remote builds?

from html-build.

sideshowbarker avatar sideshowbarker commented on June 10, 2024

Oh, this is what I was trying to fix myself today, and asked @sideshowbarker about on IRC. It looks like the main trouble was making it work both for local and remote builds?

Yes but the necessary changes are now in place so it’s just a matter of hooking the build script into them.

from html-build.

foolip avatar foolip commented on June 10, 2024

I'm working on extracting the Wattsi-Exit-Code header value, almost there.

from html-build.

foolip avatar foolip commented on June 10, 2024

I think #56, #57 and #58 together should take care of this. Saves a few seconds for a local build for me, and way more for a remote build.

from html-build.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.