Comments (11)

flyx commented on June 7, 2024

Well, YAML defines that the resolved type of an unquoted scalar shall depend only on its ancestors and its content. The ancestors basically define which Nim type inside the Nim structure the scalar is deserialised to. As for the content, the core YAML schema defines that 2 should be resolved to an integer. NimYAML will honor this mapping iff the type that this scalar is being deserialised to is able to hold an integer value. The implicit variant is able to hold an integer value, so 2 is deserialised to an integer. If you deserialise directly to a string, NimYAML is forced to resolve the type accordingly. String is basically the fallback type for any scalar. Whenever content and position allow deserialising to a more specialised type, NimYAML will do that.

Another example that would behave the same way is the scalar true. If possible, NimYAML will deserialise it to a bool value; if not, it will become a string.
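The rule described above can be sketched roughly as follows. This is a simplified model and not NimYAML's actual code: `TargetKind`, `isIntLiteral` and `resolve` are hypothetical names, and only decimal integers and lowercase `true`/`false` are recognised, whereas the core schema accepts more spellings.

```nim
type
  TargetKind = enum
    tkInt, tkBool, tkString

proc isIntLiteral(s: string): bool =
  ## True for an optional sign followed by at least one ASCII digit.
  let start = if s.len > 0 and s[0] in {'+', '-'}: 1 else: 0
  if start >= s.len: return false
  for i in start ..< s.len:
    if s[i] notin {'0'..'9'}: return false
  result = true

proc resolve(content: string, allowed: set[TargetKind]): TargetKind =
  ## Picks the most specific kind that both the scalar content and the
  ## target type allow. String is the fallback in every other case.
  if tkInt in allowed and isIntLiteral(content):
    result = tkInt
  elif tkBool in allowed and content in ["true", "false"]:
    result = tkBool
  else:
    result = tkString

echo resolve("2", {tkInt, tkString})      # target can hold an int
echo resolve("2", {tkString})             # target is a plain string
echo resolve("true", {tkBool, tkString})  # target can hold a bool
```

The `allowed` set stands in for the position of the scalar (what the target type can hold), while `content` stands in for the scalar text. Both inputs together decide the result.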

The alternative solution would be to require explicit tags on each value that is deserialised to an implicit variant. However, choosing the type implicitly based on position and content is what the YAML specification promotes, so this is what NimYAML does. If you dislike it, you can still use explicit tags (which is what you would be doing anyway if this implicit type choice did not exist).

from nimyaml.

flyx commented on June 7, 2024

How would the structure's type look in Nim? As a statically typed language, Nim itself cannot really have an arbitrarily typed field in an object.

You could perhaps store the unstructured subtree in a DOM node. It would be possible for the deserialiser to recognise a DOM node type inside a type and handle it by deserialising the YAML subtree into it (I would need to implement that though, not sure if trivial).

As for #28, I cannot do much unless the blocking Nim issue is resolved. However, seqs do get properly resolved as they are explicitly implemented in yaml.serialization and if you have custom generic types, you can implement a constructor and possibly representer yourself to avoid having to add tags.

jaccarmac commented on June 7, 2024

If seqs get properly resolved then I am doing something wrong. Can you direct me toward what I'm doing wrong in this minimal example?

import tables
import yaml

type
    Child = object
        Name: string
        Children: seq[Child]
        Metadata: Table[string, MetadataValue]
    MetadataValueKind = enum
        mdvkString, mdkvStringSeq
    MetadataValue = object
        case kind: MetadataValueKind
        of mdvkString: stringVal: string
        of mdkvStringSeq: stringSeqVal: seq[string]
markAsImplicit(MetadataValue)
setDefaultValue(Child, Children, @[])
setDefaultValue(Child, Metadata, initTable[string, MetadataValue]())

let document = """Name: parent
Children:
   - Name: child
     Children:
         - Name:
           Metadata:
               Key: value
               OtherKey: [arbitrary, value]"""

var child_objects: seq[Child] = @[]
loadMultiDoc(document, child_objects)

It fails to run with the following stacktrace.

Traceback (most recent call last)
seq_string.nim(29)       seq_string
serialization.nim(1358)  loadMultiDoc
serialization.nim(1320)  construct
serialization.nim(1143)  constructChild
serialization.nim(918)   constructObject
serialization.nim(684)   constructObjectDefault
serialization.nim(1173)  constructChild
serialization.nim(422)   constructObject
serialization.nim(1143)  constructChild
serialization.nim(918)   constructObject
serialization.nim(684)   constructObjectDefault
serialization.nim(1173)  constructChild
serialization.nim(422)   constructObject
serialization.nim(1143)  constructChild
serialization.nim(918)   constructObject
serialization.nim(684)   constructObjectDefault
serialization.nim(1143)  constructChild
serialization.nim(502)   constructObject
serialization.nim(1119)  constructChild
[[reraised from:
seq_string.nim(29)       seq_string
serialization.nim(1358)  loadMultiDoc
serialization.nim(1324)  construct
]]
[[reraised from:
seq_string.nim(29)       seq_string
serialization.nim(1363)  loadMultiDoc
]]
Error: unhandled exception: Complex value of implicit variant object type must have a tag. [YamlConstructionError]

I will look into the custom deserializers, missed that in the docs. Thanks!

jaccarmac commented on June 7, 2024

As for the structure in Nim, I've been doing some thinking and I suppose variant objects or a built-in DOM type (basically the same thing) are the way to go. In my head I was thinking of the Metadata field as a map of string to object in your standard OO language, but I guess in Nim the pseudo-tagged-pointer that this implies needs to be modeled manually.

flyx commented on June 7, 2024

Your problem has little to do with seq. It is rooted in the fact that NimYAML must choose which of the implicit variants it will deserialize to when seeing the first event yielded for the content by the parser.

So on the line Key: value, the value produces a yamlScalar event, so NimYAML can deduce that this must be the mdvkString variant. If there were other possible variants with simple types (int, bool, …), NimYAML could scan the content and deduce which kind of scalar it is.

On the other line, OtherKey: [arbitrary, value], the value starts with a yamlStartSeq event. The content of the sequence is yet to be parsed, so NimYAML cannot take it into account when choosing a variant. This is true for every complex value, i.e. any value that is a YAML collection (sequence or mapping) instead of a scalar. This is why, whenever you are using a type that is loaded from a YAML collection into an implicit variant type, you must supply a tag so that NimYAML can know which variant the following complex value should be parsed into.

Note that there isn't really a better way to implement this, since two complex types in an implicit variant may be structurally identical, so that even by scanning the complete content of the YAML collection, NimYAML cannot deduce the actual type you want the collection to be parsed into. There are a lot of other problems that make deducing a type from a YAML collection structure impractical, such as possible cycles with anchors & aliases and so on.
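The decision described in the paragraphs above (pick the variant from the first event only, and reject untagged collections) can be modelled with a small sketch. It is not NimYAML's implementation: `EventKind`, `Event`, `pickTag` and the tag `!my-seq-tag` are hypothetical, and an empty string stands for "no explicit tag".

```nim
type
  EventKind = enum
    evScalar, evStartSeq, evStartMap

  Event = object
    kind: EventKind
    tag: string  # "" stands for "no explicit tag" in this sketch

proc pickTag(first: Event): string =
  ## Decides which variant to construct by looking at the first event only.
  case first.kind
  of evScalar:
    # A scalar event carries its whole content, so a type can be guessed.
    result = if first.tag.len > 0: first.tag else: "<guessed from content>"
  of evStartSeq, evStartMap:
    # Only the start of the collection is known; its items are not parsed yet.
    if first.tag.len == 0:
      raise newException(ValueError,
        "Complex value of implicit variant object type must have a tag.")
    result = first.tag

echo pickTag(Event(kind: evScalar))
echo pickTag(Event(kind: evStartSeq, tag: "!my-seq-tag"))
try:
  discard pickTag(Event(kind: evStartSeq))
except ValueError as e:
  echo "error: ", e.msg
```

In this model the untagged sequence fails before any of its items are looked at, which matches the error in the stack trace above.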

That being said, what would theoretically be possible is to say "this implicit variant has only one complex type, so whenever I encounter a YAML collection, I need to deserialise it to that type". Problem is, NimYAML in fact does not know which types are deserialised from YAML scalars and which are deserialised from YAML collections. This is an implementation detail of the constructors and representers. You could implement a custom constructor which parses the YAML scalar 1|2|3 into a seq[int]. So unless I add a mechanism to all constructors and representers that tells NimYAML whether they expect/generate a complex type or not, this cannot be implemented.

To sum up: Your use case scratches at the far end of what I ever anticipated to be done with NimYAML and thus is hindered by both conceptual and implementation constraints, some of which may be lifted while others may not.
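To illustrate the 1|2|3 example: the parsing step of such a constructor could look like the sketch below. It only shows how a single scalar can turn into a complex Nim value and does not show how to register a constructor with NimYAML; `parsePipeSeparated` is a hypothetical name.

```nim
import std/[strutils, sequtils]

proc parsePipeSeparated(content: string): seq[int] =
  ## Splits a scalar such as "1|2|3" on '|' and parses every piece as an int.
  ## Raises ValueError if a piece is not an integer.
  content.split('|').mapIt(parseInt(it.strip()))

echo parsePipeSeparated("1|2|3")
```

From the outside, nothing reveals that the resulting seq[int] came from one scalar event rather than from a YAML sequence, which is the information NimYAML would need for the shortcut described above.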

flyx commented on June 7, 2024

a map of string to object in your standard OO language

i.e. Java. This only works for languages where every data type is an object and thus every variable typed Object is a polymorphic pointer and can point to any actual type. My usual objection to this approach is that even if you think you do not know all possible structures that may occur, you do. Because you write code that processes the loaded data, and if you do the string to object map thing, that code implicitly defines which structures are actually acceptable based on instanceof constructs.

The only exception to this is code that transforms the structure. Typically, this code does not need any information about the data semantics and thus usually operates on a DOM structure (XSLT is designed for such use-cases with XML documents, for example).

However, I can accept that a valid use-case is when you do not process the data in such a variant subtree at all, instead just handing it through to some other service. This is why I think it would be a good idea to use DOM types for such subtrees, since they preserve the structure and do not imply any semantics on the content.

jaccarmac commented on June 7, 2024

Those DOM types do not exist currently, correct? I would imagine they would look something like the existing yamlScalar etc. events, but as object types instead of events.

I also ran into another interesting variant edge case with this schema. Making Metadata a Table[string, string] lets me parse a map like {one: one, two: 2} into a table of "one" to "one" and "two" to "2". However, making the second type in that Table a variant type means that I can no longer parse the literal 2 as a string, and I am forced to add an integer variant.

jaccarmac commented on June 7, 2024

As for the use case, I agree with you in the general sense, but this specific case has some subtlety to it. Some keys need to be parsed in Nim, but the entire object is passed through to a frontend which uses JSON, where the data is necessarily untyped. So parts of the Metadata tree are typed while parts are not. And putting it that way, a custom deserializer seems like the way to go. Rather than trying to model each key-value pair of the metadata section, those semantics can be encoded for the section as a whole.

Thanks for all the help!

flyx commented on June 7, 2024

If you use a variant type and want to make 2 a string, do any of these:

  • "2"
  • '2'
  • ! 2

The DOM type exists as YamlNode, albeit in the separate yaml.dom module. I will keep this issue open as "allow subtrees to be parsed into YamlNode when constructing native types, to be able to carry over untyped information", because that enables your use-case.

jaccarmac commented on June 7, 2024

Thanks! And I understand how to force "2" into a string, I just find it inconsistent that NimYAML will parse an unquoted number as a string but not an implicit variant containing string. I assume it is due to the implementation of the implicit variant code.

EDIT: Specifically

NimYAML/yaml/serialization.nim

Lines 1084 to 1123 in 78758c8

when isImplicitVariantObject(result):
  var possibleTagIds = newSeq[TagId]()
  case item.kind
  of yamlScalar:
    case item.scalarTag
    of yTagQuestionMark:
      case guessType(item.scalarContent)
      of yTypeInteger:
        possibleTagIds.add([yamlTag(int), yamlTag(int8), yamlTag(int16),
                            yamlTag(int32), yamlTag(int64)])
        if item.scalarContent[0] != '-':
          possibleTagIds.add([yamlTag(uint), yamlTag(uint8), yamlTag(uint16),
                              yamlTag(uint32), yamlTag(uint64)])
      of yTypeFloat, yTypeFloatInf, yTypeFloatNaN:
        possibleTagIds.add([yamlTag(float), yamlTag(float32),
                            yamlTag(float64)])
      of yTypeBoolTrue, yTypeBoolFalse:
        possibleTagIds.add(yamlTag(bool))
      of yTypeNull:
        raise s.constructionError("not implemented!")
      of yTypeUnknown:
        possibleTagIds.add(yamlTag(string))
      of yTypeTimestamp:
        possibleTagIds.add(yamlTag(Time))
    of yTagExclamationMark:
      possibleTagIds.add(yamlTag(string))
    else:
      possibleTagIds.add(item.scalarTag)
  of yamlStartMap:
    if item.mapTag in [yTagQuestionMark, yTagExclamationMark]:
      raise s.constructionError(
        "Complex value of implicit variant object type must have a tag.")
    possibleTagIds.add(item.mapTag)
  of yamlStartSeq:
    if item.seqTag in [yTagQuestionMark, yTagExclamationMark]:
      raise s.constructionError(
        "Complex value of implicit variant object type must have a tag.")
    possibleTagIds.add(item.seqTag)
  else: internalError("Unexpected item kind: " & $item.kind)
  constructImplicitVariantObject(s, c, result, possibleTagIds, T)
, which I stumbled across on my debugging journey.

flyx commented on June 7, 2024

Coming back to this after many years, I have now deprecated the DOM API and made it possible to use YamlNode with the normal serialization functions. This enables you to use YamlNode anywhere in the type of your object, to avoid having to deserialize that subtree into native Nim types.
