Code Monkey home page Code Monkey logo

Comments (22)

kddnewton avatar kddnewton commented on May 18, 2024 2

Okay, I think what I'd like to go with is creating an --xml-whitespace-sensitivity option that has ignore, strict, and smart (or something like that. The first two are obviously far easier to implement (the last one requires a lot of knowledge about the schema and elements, I'm very interested in implementing it, but not as a first pass).

from plugin-xml.

winterkind avatar winterkind commented on May 18, 2024 1

@thorn0 I don't think there is anything wrong with the HTML parser. To quickly show what I meant see here.

It is XML with a given default namespace and a Title tag inside that namespace. The HTML parser throws an error because it assumes this is the Title of HTML (that's my guess at least). From my point of view that is okay because my content is not HTML but XML.

I liked your comment because it showed the effect that a whitespace option could have for our XML.

from plugin-xml.

kddnewton avatar kddnewton commented on May 18, 2024

Okay so a couple of things are going on here. Technically, the plugin is functioning as designed. Though I realize it's not the best user experience at the moment.

Unfortunately in XML whitespace is very significant unless you're inside of <>. So what's happening is when prettier is formatting these groups, it's maintaining the exact same amount of whitespace you had inside the start and end tags to maintain parity. Otherwise for a lot of XML use cases prettier would actually be changing the content unwittingly.

With lines 21-23, 25, 29, and 30, you're actually seeing where the indentation would be if whitespace were insignificant.

I'm really not entirely sure of the way around this. I've been thinking maybe an xmlIgnoreWhitespace option for those that don't care about the exact whitespace count, but it would have to be default off.

from plugin-xml.

winterkind avatar winterkind commented on May 18, 2024

Thanks for the explanation! To dumb it down for me (I am not the XML expert as you probably already noticed...):

Row 8 in the result is indented by two spaces because row 7 in the original is also indented by two spaces? Meaning tabWidth is not applied? In lines such as 21 - 23 it is applied, just not with the parent tag as basis but with their real "level" within the tree?

In that case I agree with your idea of having an option for that, so that tabWidth is applied to rows with tags as well. Because with this kind of files my ideal result would indent row 8 by 4 spaces, row 12 by 8 spaces etc.

PS: preserving the indentation inside multi-row strings is of course correct, I am not talking about those!

from plugin-xml.

kddnewton avatar kddnewton commented on May 18, 2024

So basically, when you have something like:

<foo>
  <bar />
</foo>

in XML that means that the content inside of the foo tag is actually: ["\n ", bar, "\n"]. The whitespace in this case is significant. As much as it looks like it's not because we've kind of come to agree that the spacing should look like a tree. But imagine a different example like:

<foo>
  bar <baz /> qux
</foo>

Now the content inside of the foo tag is actually: ["\n bar ", baz, " qux\n"].

from plugin-xml.

kddnewton avatar kddnewton commented on May 18, 2024

So yeah, basically I'm thinking of trying to provide a rule that says if you don't have mixed content inside the tags, then we could probably just consider the whitespace as insignificant and indent as we like, but right now it would definitely change things otherwise and potentially break XML integrations.

from plugin-xml.

bd82 avatar bd82 commented on May 18, 2024

Specification wise XML is indeed whitespace sensitive.

However what % of real XML files are really whitespace sensitive?
Perhaps the default of ignoring whitespace in contents would be more functionally useful even if not 100% correct while those users who have meaningful whitespace would need to explicitly turn the WS ignoring off?

If we do not ignore whitespace it is very difficult to get a real "pretty XML"...
But I do not know what is the % of users who have meaningful whitespace?

This issue is what stumped me when I was playing around with a prettier plugin for
XML a few months ago... 😒

from plugin-xml.

kddnewton avatar kddnewton commented on May 18, 2024

I think it pretty much has to be an option. Otherwise this is really only prettier-ifying open/close tags and prolog content.

from plugin-xml.

kddnewton avatar kddnewton commented on May 18, 2024

Now to figure out what to name it. xmlIgnoreWhitespace? xmlWhitespaceInsignificant? xmlWhitespaceSignificantOnlyOnMixedContent?

from plugin-xml.

thorn0 avatar thorn0 commented on May 18, 2024

For HTML, the analogous option of Prettier is called --html-whitespace-sensitivity.

Also schema languages (DTD, XML Schema, RELAX NG, Schematron) are used to define, among other things, whitespace rules for elements. If the document has an associated schema, you should be able to get the needed info about whitespace significance from there.

Finally, you can probably somehow use the standard xml:space attribute, but I don't know much about it.

from plugin-xml.

bd82 avatar bd82 commented on May 18, 2024

My 2 cents:

I would go with the option name --html-whitespace-sensitivity linked above.
but only keep two of the three options:

  • "strict" - Whitespaces are considered sensitive.
  • "ignore" - Whitespaces are considered insensitive.

And use "ignore" by default and see if any users complain. πŸ˜„

I'd probably also emphasize this option and its default behavior in the documentation.
and only in the longer term evaluate more granular control e.g via xml:space attribute.

from plugin-xml.

thorn0 avatar thorn0 commented on May 18, 2024

Sounds good for starters.

The "css" value would make sense too as XML is one of the two syntaxes of HTML5.

And it might be possible to come up with a simple, but good enough heuristic for another value, "guess". E.g. "if an element doesn't have any non-whitespace text nodes among its children, then its whitespace-only child text nodes can be ignored". Something like that.

from plugin-xml.

winterkind avatar winterkind commented on May 18, 2024

I would go with the option name --html-whitespace-sensitivity linked above.
but only keep two of the three options:

  • "strict" - Whitespaces are considered sensitive.
  • "ignore" - Whitespaces are considered insensitive.

And use "ignore" by default and see if any users complain.

That sounds very reasonable. In our specific case, all whitespaces are insignificant, except for those inside multi-row strings. So if a value of ignore would also ignore those, then a third value like stringsOnly would be good, which would only keep whitespace inside of multi-row strings.

from plugin-xml.

thorn0 avatar thorn0 commented on May 18, 2024

@winterkind BTW, did you try formatting your files with Prettier's HTML formatter and --html-whitespace-sensitivity ignore? Should work fine in your case: playground.

from plugin-xml.

winterkind avatar winterkind commented on May 18, 2024

@thorn0 You are right - that looks very good, also with our larger files. It just needs some convincing at this point. The "babel" parser complains about HTML comments and the explicit HTML parser has issues with our default namespace not being HTML.

But as I said: with the "babel" parser and after removing comments, the output is excellent.

from plugin-xml.

thorn0 avatar thorn0 commented on May 18, 2024

@winterkind I meant the html parser. What issues exactly did you have with it? Could you demonstrate them using a link to the playground? (Better in a new issue in https://github.com/prettier/prettier to not hijack this issue with things not related directly to plugin-xml.)

from plugin-xml.

bd82 avatar bd82 commented on May 18, 2024

I am unsure if HTML is always a valid (syntactically) XML.

from plugin-xml.

thorn0 avatar thorn0 commented on May 18, 2024

@bd82 It's usually not, but that doesn't mean Prettier's HTML parser (Angular's actually) can't be smart enough to support XML. It's something I'd like to investigate when I get some time.

from plugin-xml.

infotexture avatar infotexture commented on May 18, 2024

Re. #40 (comment):

Also schema languages (DTD, XML Schema, RELAX NG, Schematron) are used to define, among other things, whitespace rules for elements. If the document has an associated schema, you should be able to get the needed info about whitespace significance from there.

Yep, proper context-sensitive whitespace handling is basic table stakes for an XML processing toolchain, so a simple on/off, strict/ignore approach wouldn't be sufficient, the plug-in would need to toggle the mode based on the current element.

Finally, you can probably somehow use the standard xml:space attribute, but I don't know much about it.

The schema languages typically handle that for each XML dialect, so the schema will typically use @xml:space to say something like β€œpreserve all whitespace in elements named <codeblock>”.

For example, from the DITA spec:

@xml:space
This attribute is provided on <pre>, <lines>, and on elements specialized from those.
It ensures that parsers in editors and transforms respect the white space, including line-end
characters, that is part of the data in those elements. It is intended to be part of the default
properties of these elements, and not for authors to change or delete. When defined, it has a
fixed value of "preserve".

from plugin-xml.

kddnewton avatar kddnewton commented on May 18, 2024

@winterkind would you be able to test the whitespace-ignore branch on this (and set the xmlWhitespaceSensitivity to ignore)? Want to make sure it works for your use case.

from plugin-xml.

kddnewton avatar kddnewton commented on May 18, 2024

Actually I've just gone ahead and released a v0.6.0 with the changes. I'm going to close this and if we find a need for the smart mode in the future we can build it. For now @winterkind try setting xmlWhitespaceSensitivity to "ignore" and it should be better for you.

from plugin-xml.

winterkind avatar winterkind commented on May 18, 2024

Hi, sorry for not responding earlier; I was sick for a few days. I really appreciate your help with this.

I tried 0.6.0 with the new option and it looks very good. I'll test with some more of our files and if I find anything else, I'll create new issues.

from plugin-xml.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.