Comments (2)

jgm commented on July 20, 2024

Currently the syntax for attributes (undocumented except in code comments) is

 attributes <- '{' whitespace* attribute (whitespace attribute)* whitespace* '}'
 attribute <- identifier | class | keyval
 identifier <- '#' name
 class <- '.' name
 name <- (nonspace, nonpunctuation other than ':', '_', '-')+
 keyval <- key '=' val
 key <- (ASCII_ALPHANUM | ':' | '_' | '-')+
 val <- bareval | quotedval
 bareval <- (ASCII_ALPHANUM | ':' | '_' | '-')+
 quotedval <- '"' ([^"] | '\"')* '"'
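
To make the grammar concrete, here is a minimal Python sketch that classifies a single attribute token according to the rules above. The names `is_name` and `classify_attribute` are hypothetical, and treating "punctuation" as ASCII punctuation is an assumption; the actual implementation may define it differently. The quoted-value rule is read here as allowing any number of characters between the quotes.

```python
import re
import string

# Assumption: "punctuation" means ASCII punctuation (string.punctuation).
ALLOWED_PUNCT = {":", "_", "-"}
FORBIDDEN = set(string.punctuation) - ALLOWED_PUNCT

KEY = r"[A-Za-z0-9:_-]+"
# key '=' (bareval | quotedval); a quoted value is any run of
# non-quote characters or backslash escapes between double quotes.
KEYVAL = re.compile(rf'({KEY})=(?:({KEY})|"((?:[^"\\]|\\.)*)")')


def is_name(s: str) -> bool:
    """name <- (nonspace, nonpunctuation other than ':', '_', '-')+"""
    return len(s) > 0 and all(
        not ch.isspace() and ch not in FORBIDDEN for ch in s
    )


def classify_attribute(token: str):
    """Return 'identifier', 'class', 'keyval', or None if nothing matches."""
    if token.startswith("#") and is_name(token[1:]):
        return "identifier"
    if token.startswith(".") and is_name(token[1:]):
        return "class"
    if KEYVAL.fullmatch(token):
        return "keyval"
    return None


for tok in ["#foo", ".bar", 'key="some value"', "#foo.bar"]:
    print(tok, "->", classify_attribute(tok))
```

Under these assumptions `#foo.bar` matches no rule, because the period is punctuation that `name` does not allow.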

So we don't allow . in an identifier. I can't recall whether there was a specific reason for this.
XML identifiers are more restrictive than this (must start with letter or underscore). HTML4 identifiers are less restrictive, and HTML5 identifiers are much less restrictive.

Class names have more restrictions (at least if they're to be used with CSS).

EDIT: Anyway, I'm open to making this less restrictive, but some thought needs to go into what would be a reasonable restriction.

bmschmidt commented on July 20, 2024

At first glance, it seems that djot has a principle of not distinguishing between the first character and later characters in ids, possibly for simplicity of implementation. That would dictate that . can't appear in ids, because '.' name indicates a class. Or possibly it's just that classes and ids follow the same pattern, and a class name in djot may not contain periods (which I agree is a good decision).

> HTML4 identifiers are less restrictive

As I understand it, HTML4 ids are generally extremely restrictive, because they follow the SGML rules laid out in ISO 8879:1986. #1 and #:, for example, are valid djot identifiers but invalid as HTML4 identifiers or class names, because they don't start with [A-Za-z].

The only case I see where djot is more restrictive than HTML4 is that "foo.bar" is a valid HTML4 identifier but an invalid djot identifier, because it contains a period. This difference prevents a lot of pretty basic ASCII-encoded HTML4 from being able to round-trip through djot back to HTML.
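
The mismatch described above can be checked mechanically. The sketch below compares the HTML 4.01 ID token rule (a letter followed by letters, digits, hyphens, underscores, colons, and periods) with the current djot name rule. The function names are hypothetical, and treating djot's "punctuation" as ASCII punctuation is an assumption.

```python
import re
import string

# HTML 4.01 ID token: starts with a letter, then letters, digits,
# hyphens, underscores, colons, and periods.
HTML4_ID = re.compile(r"[A-Za-z][A-Za-z0-9_:.-]*")

# Assumption: djot's "punctuation" is ASCII punctuation, minus ':', '_', '-'.
DJOT_FORBIDDEN = set(string.punctuation) - {":", "_", "-"}


def valid_html4_id(s: str) -> bool:
    return HTML4_ID.fullmatch(s) is not None


def valid_djot_name(s: str) -> bool:
    return len(s) > 0 and all(
        not ch.isspace() and ch not in DJOT_FORBIDDEN for ch in s
    )


for candidate in ["foo", "1", ":", "foo.bar"]:
    print(
        f"{candidate!r}: html4={valid_html4_id(candidate)} "
        f"djot={valid_djot_name(candidate)}"
    )
```

Under these assumptions, `1` and `:` pass only the djot check, while `foo.bar` passes only the HTML4 check.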

I have one firm proposal, which is to disentangle the identifier and class rules to allow non-initial identifier characters to be periods. I.e.:

 identifier <- '#' nameChar subsequentIdChar*
 class <- '.' nameChar+
 nameChar <- (nonspace, nonpunctuation other than ':', '_', '-')
 subsequentIdChar <- (nonspace, nonpunctuation other than ':', '_', '-', '.')
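
As a rough illustration of the proposal, the following Python sketch separates the identifier rule from the class rule: the first identifier character follows the existing name rule, and later characters may also be periods. The function names are hypothetical, and "punctuation" is again assumed to mean ASCII punctuation.

```python
import string

# Assumption: "punctuation" means ASCII punctuation (string.punctuation).
NAME_PUNCT = {":", "_", "-"}
SUBSEQUENT_ID_PUNCT = NAME_PUNCT | {"."}


def _is_allowed(ch: str, allowed_punct: set) -> bool:
    """Nonspace, and nonpunctuation except for the allowed set."""
    if ch.isspace():
        return False
    return ch not in string.punctuation or ch in allowed_punct


def is_identifier_name(s: str) -> bool:
    """First char is a nameChar; later chars may also be '.'."""
    return (
        len(s) > 0
        and _is_allowed(s[0], NAME_PUNCT)
        and all(_is_allowed(ch, SUBSEQUENT_ID_PUNCT) for ch in s[1:])
    )


def is_class_name(s: str) -> bool:
    """Class names keep the current rule: no periods anywhere."""
    return len(s) > 0 and all(_is_allowed(ch, NAME_PUNCT) for ch in s)


print(is_identifier_name("foo.bar"))  # period after the first character
print(is_identifier_name(".foo"))     # leading period
print(is_class_name("foo.bar"))       # period in a class name
```

With this split, `foo.bar` is accepted as an identifier name but still rejected as a class name, and a leading period is rejected in both.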

My goals would be served equally well by the parser accepting periods in ids in any position but requiring them to be escaped (\.). But that feels uglier.

I don't have opinions about any larger related changes, though I do like how Unicode characters can be used in id and class names in djot.


Just for context, I should say that my interest here is not primarily in writing in djot, but in getting things into djot's AST, which is much nicer to work with than pandoc's for my purposes.
