
Comments (7)

witten commented on May 18, 2024

Piggybacking on this issue, although perhaps I should open a new one. Here's another potentially valid use of tags: YAML file includes. Here's an actual example from the wild:

retention:
    !include /etc/borgmatic/common_retention.yaml

The idea is that a common YAML fragment gets dynamically included into the YAML document in question at runtime. The main rationale is reuse, so as to avoid having to repeat common configuration in multiple documents. (Usable by non-programmers? Perhaps, although it is admittedly a little advanced.)

But wait, I hear you say, can't you do that outside of strictyaml, and then still use strictyaml for the other aspects of YAML parsing and validation? You can, but not without problems. Two alternative approaches I can think of:

  1. Prior to feeding the YAML document to strictyaml, pre-process it (e.g. with raw ruamel.yaml) to inline all includes and produce a single document. This works, but then any line numbers in strictyaml error messages are completely bogus in relation to the source YAML files.
  2. Or, escape the include tags (as suggested by a comment above), and then after strictyaml parses and validates the YAML document, post-process the include directives at the application level. This doesn't really work well though, because then you give up strictyaml schema validation on any part of the YAML document that's pulled in by an include. And in fact, schema validation may simply not work if a required part of the document is hidden behind an include tag that strictyaml doesn't understand.
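A minimal sketch of the first approach, assuming a hypothetical text-level directive format (a line containing only `!include <path>`) and an in-memory `files` dict standing in for reads from disk. Neither is strictyaml or ruamel.yaml API. It inlines the fragment and records where each merged line came from, which shows how the merged document's line numbers drift away from the source files:

```python
import re
from typing import Dict, List, Tuple

# Hypothetical directive format: a line whose only content is "!include <path>".
INCLUDE = re.compile(r"^(\s*)!include\s+(\S+)\s*$")


def expand(name: str, files: Dict[str, str], indent: str = "") -> List[Tuple[str, str, int]]:
    """Return (merged line, source file, source line number) for every line."""
    out: List[Tuple[str, str, int]] = []
    for number, line in enumerate(files[name].splitlines(), start=1):
        match = INCLUDE.match(line)
        if match is None:
            out.append((indent + line, name, number))
        else:
            # Re-indent the fragment so it nests under the parent key.
            out.extend(expand(match.group(2), files, indent + match.group(1)))
    return out


# Illustrative file contents; the names are assumptions, not real paths.
files = {
    "config.yaml": "location:\n    repo: backup\nretention:\n    !include common_retention.yaml\nhooks: none\n",
    "common_retention.yaml": "keep_daily: 7\nkeep_weekly: 4\n",
}

lines = expand("config.yaml", files)
merged_text = "\n".join(text for text, _, _ in lines)

# Merged line 6 ("hooks: none") originates from config.yaml line 5, so an
# error reported against the merged text points at the wrong source line.
print(lines[5])
```

Keeping an origin map like this is one possible way to translate a merged line number back to a file and line, at the cost of extra bookkeeping in the pre-processing step.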

To be clear, I'm not making a feature request here for file include functionality in strictyaml (although that'd be pretty great). Rather, I'm making the case that support for custom tags in strictyaml would be pretty darn useful — even necessary for some use cases.

from strictyaml.

crdoconnor commented on May 18, 2024

Hi Simon,

Thanks for the comments, kind words, etc. Sorry I haven't answered earlier. This is a good comment and I'll link to it in the docs.

> I was looking at the removed features, and it lists explicit tags as being a form of syntax typing, which is absolutely bad when it's defined by the schema, yes! But tags don't have to be used that way, they can be used as a reserved syntax for alternate ways to provide a value.

I think this is a valid way of using them, but equally I think that they are not a necessary feature in order to implement that. I follow the rule of least power pretty assiduously when I define DSLs. The corollary of that principle being that unless I consider a powerful new feature necessary I leave it out - "usefulness" is not enough of a prerequisite on its own.

In your examples, I think that the benefits of using tags could still be achieved fairly easily without using them, so it fails the necessity test. The schema language I have defined with strictyaml could possibly make parsing your example above a bit easier (and I'd be happy to make improvements of that kind), but there's nothing intrinsically stopping it from being done even right now.**

> As another example that is closer to the original justification for removal, you can also (and it is probably the original intention of tags) use tags to provide types better syntax and lower (user) implementation cost where it's not directly providable by the schema.

This absolutely happens. However, where this is being done I'd consider it a bug in the schema that needs to be fixed - and not a bug that the schema language should attempt to work around.

I feel like your second example is actually slightly confusing to a non-programmer - the notion that exclamation points should be used in one place but not the other, for instance.

> That said, I'm perfectly OK with strictyaml not supporting tags for implementation or compatibility complexity.

It's mainly because I don't feel like the schema language should have knowledge or opinion of types beyond string, mapping and list because doing so opens up such an incredible can of worms. The problem I had that actually kicked this project off was largely due to some of those worms.

Where there is a need for users to have options on how they supply data that have type implications I feel like moving the problem of handling types to the programmer writing the parser is a better solution - it keeps the schema language from being overloaded with cruft that will confuse things.

** There is a minor exception in that it will forcefully reject the use of unquoted ! because it intentionally disallows this feature. If you truly wanted a strictyaml schema that had a smart interpretation of strings that start with ! it would have to start with a quote (') or be done using a multiline string (|).
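As a rough illustration of the footnote, suppose the document quotes the value (for example `retention: '!include /etc/borgmatic/common_retention.yaml'`) so that it reaches the application as an ordinary string. The helper below is a hypothetical application-level function, not part of strictyaml, that gives such strings a "smart" interpretation after parsing:

```python
from typing import Optional, Tuple


def split_directive(value: str) -> Optional[Tuple[str, str]]:
    """Return (directive, argument) for strings like '!include path', else None."""
    if not value.startswith("!"):
        return None
    name, _, argument = value[1:].partition(" ")
    if not name:
        # A bare "!" or "! something" is not treated as a directive.
        return None
    return name, argument.strip()


print(split_directive("!include /etc/borgmatic/common_retention.yaml"))
print(split_directive("keep everything"))
```

Note that this is exactly the shadowing trade-off raised later in the thread: any legitimate string value that happens to start with `!` would need some further escaping convention.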

simonbuchan commented on May 18, 2024

Yeah, there is definitely a syntax boost with correctly used tags, but they definitely aren't required. As a personal preference, I would probably keep them, since the syntax guides the semantics, a property I like to preserve, but I'm not the one that did the work of creating a language, so I don't get to complain too much!

Your footnote implied you were thinking "!some-type some-value"? I would definitely avoid being that close to a yaml feature in a yaml subset. Further, in the first example, it's a bit sucky to shadow valid values and require another level of escaping 🤷‍♂️. I'm happiest with the existing non-tag syntax AWS has, an object that has one property, e.g. !foo bar -> foo: bar. The biggest problem with that is YAML (and thus strictyaml) doesn't support nesting property names on one line, which makes cases where every property has a "typed" value much noisier and harder to read, as seen in the examples.

I've also seen (including in the same AWS example!) the object with a type property, which can also work out well if you're going to have an object anyway. Supporting this generically is possible with type unions (and, in typed implementation languages, discriminator support), but it gets tricky to give good error messages (e.g. in your own example for unions it confusingly reports "expected an integer" for Bool() | Int()), so it probably makes more sense to support this at the schema level directly if this is the recommended usage.

If you were thinking of additions to cover this kind of usage, the schema supporting (some generalization of?) "exactly one of these properties" would be the thing I would suggest - perhaps the key validator can do this?
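To make the "exactly one of these properties" idea concrete, here is a small sketch that runs on already-parsed data (a plain mapping). The function name and the keys `foo` and `baz` are hypothetical, and this is an application-level check rather than a strictyaml validator:

```python
from typing import Any, Iterable, Mapping, Tuple


def single_variant(node: Mapping[str, Any], allowed: Iterable[str]) -> Tuple[str, Any]:
    """Require exactly one key from `allowed` and nothing else; return (key, value)."""
    allowed_keys = set(allowed)
    unknown = sorted(key for key in node if key not in allowed_keys)
    if unknown:
        raise ValueError(f"unexpected keys: {unknown}")
    present = sorted(key for key in node if key in allowed_keys)
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {sorted(allowed_keys)}, found {present}"
        )
    return present[0], node[present[0]]


# The single-key form of "!foo bar" is the mapping {"foo": "bar"}.
print(single_variant({"foo": "bar"}, ["foo", "baz"]))
```

Because the error names the full set of permitted keys, it avoids the confusing "expected an integer" style of message that a generic union can produce.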

> I feel like your second example is actually slightly confusing to a non-programmer - the notion that exclamation points should be used in one place but not the other, for instance.

I am always wrong about what confuses and doesn't confuse non-programmers, so it's quite possible 😅. That said, I think it's handy that the exclamation point is "loud" here, saying how it's different, e.g. "you can only have one name here, but it can be one of a set of valid names, each of which has different contents". Only usability testing would say for sure, but I doubt the difference would be so significant that it would outweigh other concerns, either design principles or the library vs (code) usage complexity concerns on the other side.

crdoconnor commented on May 18, 2024

Hi @witten, thanks for your comment.

> Prior to feeding the YAML document to strictyaml, pre-process it (e.g. with raw ruamel.yaml) to inline all includes and produce a single document. This works, but then any line numbers in strictyaml error messages are completely bogus in relation to the source YAML files.

If you write your own processing step which picks up a filename from the 'master' document and then tries to read it with another schema from the 'child' included document then the line number of any schema violation for the child document would be correct, would it not?

> Or, escape the include tags (as suggested by a comment above), and then after strictyaml parses and validates the YAML document, post-process the include directives at the application level. This doesn't really work well though, because then you give up strictyaml schema validation on any part of the YAML document that's pulled in by an include.

Well, you could validate them separately, could you not?
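The two-pass idea can be sketched as follows. To keep the example self-contained, `load_flat` is a toy stand-in that only understands flat `key: value` lines; it is not strictyaml's API, and in practice strictyaml's own loader, called once per file with that file's schema, would presumably fill this role. The file names and contents are illustrative assumptions:

```python
from typing import Callable, Dict


class ValidationError(Exception):
    """Raised by the toy loader; the message names the file and line."""


def load_flat(text: str, label: str, schema: Dict[str, Callable[[str], object]]) -> Dict[str, object]:
    """Toy stand-in for a schema-validating loader (flat 'key: value' lines only)."""
    result: Dict[str, object] = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        key, _, raw = line.partition(":")
        key, raw = key.strip(), raw.strip().strip("'")
        if key not in schema:
            raise ValidationError(f"{label}, line {number}: unexpected key {key!r}")
        try:
            result[key] = schema[key](raw)
        except ValueError:
            raise ValidationError(
                f"{label}, line {number}: bad value {raw!r} for {key!r}"
            ) from None
    return result


files = {
    "master.yaml": "name: nightly\nretention: '!include retention.yaml'\n",
    "retention.yaml": "keep_daily: 7\nkeep_weekly: four\n",
}

# Pass 1: the parent schema only requires the escaped include to be a string.
master = load_flat(files["master.yaml"], "master.yaml", {"name": str, "retention": str})
target = str(master["retention"]).split(None, 1)[1]

# Pass 2: the child is validated on its own, so the reported line number is
# relative to the child file rather than to a merged document.
try:
    load_flat(files[target], target, {"keep_daily": int, "keep_weekly": int})
    message = ""
except ValidationError as error:
    message = str(error)

print(message)
```

This works here only because the child file has a schema of its own; it does not address the case where a user factors out an arbitrary portion of the parent document.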

witten commented on May 18, 2024

Thanks for the quick response.

> If you write your own processing step which picks up a filename from the 'master' document and then tries to read it with another schema from the 'child' included document then the line number of any schema violation for the child document would be correct, would it not?

Yes, but not easily! With the particular include approach I happen to be using, a user can decide to factor out and include any arbitrary portion of the main YAML document. So I don't necessarily have a separable schema for just the fragment that they've put in a separate file. I suppose, before feeding the YAML with escaped includes to strictyaml, I could try to dynamically split the main schema into separate portions based on where the includes are. But then that'd require both a pre-processing step (to locate the includes and split up the schema) and a post-processing step (to interpret the includes and apply the sub-schemas at the application level).

crdoconnor commented on May 18, 2024

Is there a particular issue with that approach? I have quite a few systems using strictyaml which do multipass validation.

witten commented on May 18, 2024

It just seems like a lot of work (three passes, and a fair amount of complexity to make the schemas separable at runtime) to do something that could in theory be done in one pass. But I'll give it a shot anyway. 😃

For comparison, my current non-strictyaml code does one pass to load, and then does validation on the resulting data structure in memory. (I do realize that performance is not one of strictyaml's main goals.)
