Comments (20)
This seems like a nice potential savings.
Agreed
@sideshowbarker are you familiar enough with the various things Wattsi can output to write a regex to detect such line numbers? I recall them being in parenthesis; maybe just (\d+)?
We could always do it that way or I think it’s also possible that we could have wattsi exit with a particular code (I mean other than just 1
) if it has reported any any line numbers as part of the error message. Then of course we just make the build script check for that exit code.
If we can actually make wattsi do that, shall we handle it that way?
from html-build.
A very quick look back at the wattsi sources indicates that maybe the only time wattsi ever actually reports line numbers is for parse errors. If that’s the case it should be very easy to just make it exit with a different code if it has found any parse errors.
from html-build.
That makes a lot of sense to me, yeah.
from html-build.
Although in theory we could consider adding line numbers to other errors in the future, instead of talking about the nearest heading... that's a separate topic though.
from html-build.
Although in theory we could consider adding line numbers to other errors in the future, instead of talking about the nearest heading... that's a separate topic though.
Yeah, that would definitely make things much easier for contributors. So I guess we should open another wattsi issue for that. In the case of the parse errors, looking at the source I see they’re handled as actual exceptions, inheriting from an Exception
class, and I think by doing it that way, we kind of get the line and column numbers for free.
from html-build.
In the case of the parse errors, looking at the source I see they’re handled as actual exceptions, inheriting from an Exception class, and I think by doing it that way, we kind of get the line and column numbers for free.
So the part of the wattsi source that actually adds the line and column numbers to the message is at https://github.com/whatwg/wattsi/blob/master/src/html/htmlparser.pas#L1195 so while they’re not just coming from some kind of generic input exception, it seems they are exposed by FInputStream.Line
and FInputStream.Column
from whatever stream wattsi is reading from.
Given all that and thinking about what the non-parsing parts of the code are doing, I realize the challenge is that the other parts of the code are probably not working from a stream at all but instead just on the tree the parser builds/exposes. So at those points in the code I think there are no line/column numbers readily available. Which is why Hixie didn’t have line/column numbers in those messages to begin with, and why it will take some significant work to implement any kind of line/column reporting for them.
Anyway, it’s definitely worth having an issue open to track.
from html-build.
I guess the trick would be to associate a locationinfo (line/column) with each element at parse time, that can later be used by Describe().
from html-build.
I guess the trick would be to associate a locationinfo (line/column) with each element at parse time, that can later be used by Describe().
Yeah, exactly. That’s what I had initially thought the existing code might already have. But given that wattsi doesn’t seem to be hooking into (or built on top of) any generic parsing API (like SAX or whatever) that might provide a locationinfo thingy, it seems like some custom locationinfo provider would need to be written into the parser (and then exposed as a member of element objects in the tree).
from html-build.
So whatwg/wattsi#8 will cause the exit code for wattsi (local) to be set to 65
in the case when there are line numbers in the message output—which as discussed here earlier only occurs with the current wattsi code when wattsi has emitted one or more parse errors.
So I can easily update the build script to check for that special 65
exit code in the wattsi local case—but if catching that special exit code is how we want to handle it in the local case, then we need some similar way to catch in the wattsi-server case.
So, a proposal: Change wattsi-server behavior to be that it returns the HTTP code 400
(“malformed syntax”) only in the (parse-error-with-line-numbers reported) case where the its local wattsi has exited with 65
exit code, and otherwise returns the HTTP code 422
(“syntactically correct but semantically erroneous”) if other (non-parser) errors were found (i.e., errors that wattsi finds while examining the document tree and for which is currently does not report line numbers).
from html-build.
Or to more closely reflect the actual wattsi exit code, maybe instead return a 465
HTTP code for the wattsi exit-code 65
case, and just continue returning a 400
for any other case where wattsi has found an error.
from html-build.
Hmm. I'm down with the 400/422 plan. But what do you think of outputting the wattsi exit code directly for consumption? Either as a wattsi-exit-code.txt file inside the zip, or as a header in the response? I am not sure which would be better.
from html-build.
But what do you think of outputting the wattsi exit code directly for consumption?
Yeah I think that would be better all around than trying to shoehorn it into HTTP semantics.
Either as a wattsi-exit-code.txt file inside the zip, or as a header in the response? I am not sure which would be better.
I think sending it as a header in the response would be cleaner and less error-prone than writing it out as a file and packaging it up with the output. But I don’t feel strongly about it.
from html-build.
The build server now delivers a Wattsi-Exit-Code header. It also accepts a ?quiet query string, and I recommend we switch the URL to add /wattsi so that we can stop serving wattsi for all URLs to allow e.g. whatwg/build.whatwg.org#4 to work at a different build URL.
from html-build.
The build server now delivers a Wattsi-Exit-Code header. It also accepts a ?quiet query string,
Excellent. So with that I think we can get a PR landed for this shortly.
recommend we switch the URL to add /wattsi so that we can stop serving wattsi for all URLs to allow e.g. whatwg/build.whatwg.org#4 to work at a different build URL.
SGTM. So you wanna go ahead and just add the /wattsi URL now? And then I can update the build script (and after that’s landed you could disable the old URL).
from html-build.
Right now the build server actually accepts any incoming HTTP request and does the same thing :). So /wattsi already works. What I meant is that in the course of making your changes if you could change it to /wattsi, it will still work, and then once that's merged I can update the server to only accept /wattsi.
from html-build.
ah, OK, gotcha
from html-build.
Oh, this is what I was trying to fix myself today, and asked @sideshowbarker about on IRC. It looks like the main trouble was making it work both for local and remote builds?
from html-build.
Oh, this is what I was trying to fix myself today, and asked @sideshowbarker about on IRC. It looks like the main trouble was making it work both for local and remote builds?
Yes but the necessary changes are now in place so it’s just a matter of hooking the build script into them.
from html-build.
I'm working on extracting the Wattsi-Exit-Code
header value, almost there.
from html-build.
I think #56, #57 and #58 together should take care of this. Saves a few seconds for a local build for me, and way more for a remote build.
from html-build.
Related Issues (20)
- Update highlighter once it updates widlparser
- Build does not fail on invalid Web IDL HOT 5
- Review Drafts have MDN in them; they should not HOT 2
- --password warning HOT 2
- MDN boxes sometimes have multiple entries for the same browser HOT 3
- Remove caniuse boxes in favor of MDN compat data boxes? HOT 9
- Build failing, potentially Python? HOT 2
- Migrate to Python 3 HOT 2
- Use Docker Hub pdfsizeopt instead of downloading from GitHub
- Document why macOS cannot do HTTP/2 by default HOT 2
- Consider a "fast mode" for local iteration HOT 3
- .pre-process-main.pl can be simplified or rolled into Wattsi HOT 2
- Add service worker support
- html-build repo is missing License text file
- "Improvements to the CI Docker build" broken PDF links HOT 1
- `.cache` folder is not created by `build.sh` HOT 1
- Docker build fails on M1 MacBook HOT 5
- Build failing locally when trying to send to server HOT 1
- Local build doesn't catch HTML parsing errors
- Automatically generate bibliography entries
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from html-build.