Comments (15)

vassudanagunta commented on June 19, 2024

> About terminology, I'd say that "IR" is the genus and "AST" is one species.

ok, thank you.

RE the bigger question, some things to consider:

  1. I would say that Pandoc solves the problem of translating between so many different input and output formats by defining an IR that is syntax-independent and more or less a semantic superset of those syntaxes. The question then becomes whether representing paragraphs that span block quotes (or other elements, see below) is a universal or common enough need to warrant complicating Pandoc's IR.

  2. An old W3C www-html list discussion: Re: Lists within Paragraphs. An excerpt:

    I think this is part of a bigger problem.  Paragraph's can't contain block
    level elements.  At first this seems to make a lot of sense.  But it
    doesn't work in many instances.
    
    For example often block level mathematical formulas occur in paragraphs.
    If we consider
    
                      x + y = z
    
    as such an example, we see that in this case this paragraph is the still
    the same one, but we have a block level element in it.
    
  3. The HTML spec's ultimate answer admits that paragraphs might logically span block elements, but says that HTML's p element does not model such logical paragraphs:

    List elements (in particular, ol and ul elements) cannot be children of p elements. When a sentence contains a bulleted list, therefore, one might wonder how it should be marked up.

    For instance, this fantastic sentence has bullets relating to

    • wizards,
    • faster-than-light travel, and
    • telepathy,

    and is further discussed below.

    The solution is to realize that a paragraph, in HTML terms, is not a logical concept, but a structural one. In the fantastic example above, there are actually five paragraphs as defined by this specification: one before the list, one for each bullet, and one after the list.

    The markup for the above example could therefore be:

    <p>For instance, this fantastic sentence has bullets relating to</p>
    <ul>
    <li>wizards,
    <li>faster-than-light travel, and
    <li>telepathy,
    </ul>
    <p>and is further discussed below.</p>
    

    Authors wishing to conveniently style such "logical" paragraphs consisting of multiple "structural" paragraphs can use the div element instead of the p element.

    Thus for instance the above example could become the following:

    <div>For instance, this fantastic sentence has bullets relating to
    <ul>
    <li>wizards,
    <li>faster-than-light travel, and
    <li>telepathy,
    </ul>
    and is further discussed below.</div>
    

    This example still has five structural paragraphs, but now the author can style just the div instead of having to consider each part of the example separately.

  4. Allowing paragraphs to span/nest block elements provides, I think, a cleaner and more consistent solution to "tight lists". For example, the following would be a tight list because each list item contains exactly a single element (a paragraph):

    - para 1
    - para 2
      - a
      - b
      - c
    - para 3
    

    The current CommonMark solution has flaws, as can be seen by comparing

    - item one
    - item two
      # a heading
      more text
    - item three
    

    with

    - item one
    - item two
      
      a heading
      ---------
      more text
    - item three
    

    Both should be treated as loose lists, since the second item in each contains a sequence of blocks, but CommonMark's determination is based on the presence or absence of blank lines in the source, not on logical structure.
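
To make the contrast concrete, here is a minimal Python sketch of the two rules. It is an illustration under stated assumptions, not CommonMark's actual algorithm: `Block`, `blank_before`, and both function names are hypothetical, and the blank-line rule is simplified to "any blank line between sibling blocks makes the list loose".

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str                   # e.g. "para" or "heading"
    blank_before: bool = False  # a blank line preceded this block in the source


def tight_by_blank_lines(items: list[list[Block]]) -> bool:
    """Simplified blank-line rule: any recorded blank line makes the list loose."""
    return not any(block.blank_before for item in items for block in item)


def tight_by_structure(items: list[list[Block]]) -> bool:
    """Structural rule from point 4: tight only if every item holds one block."""
    return all(len(item) == 1 for item in items)


# Second item written with an ATX heading and no blank lines.
atx_list = [
    [Block("para")],
    [Block("para"), Block("heading"), Block("para")],
    [Block("para")],
]

# Same structure, but the setext heading needs a blank line before it.
setext_list = [
    [Block("para")],
    [Block("para"), Block("heading", blank_before=True), Block("para")],
    [Block("para")],
]

for name, items in [("atx", atx_list), ("setext", setext_list)]:
    print(name, tight_by_blank_lines(items), tight_by_structure(items))
```

Under these assumptions the two lists differ only in `blank_before`, so the blank-line rule classifies them differently while the structural rule treats both as loose.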

I hope this is helpful. Please let me know if you've had enough! It just happens to be a question I've been trying to tackle myself.

from djot.

uvtc commented on June 19, 2024

Re @bpj's suggestion about indenting: would this cause a problem with putting lists between paragraphs? That is, with a list you may (and typically do) indent the list marker. Is there a difference between a list that's its own paragraph vs a list that's in the midst of a paragraph?

bpj commented on June 19, 2024

Why not indentation for an embedded blockquote? That seems the most intuitive to me.

bpj commented on June 19, 2024

I mean that if what follows the blockquote is a continuation, the blockquote is indented; that is, an indented blockquote is one embedded in a paragraph.

paragraph

    > blockquote inside paragraph

rest of paragraph (continuation)

vs.

first paragraph

> blockquote after paragraph

another paragraph

I hope that is clearer.

uvtc commented on June 19, 2024

Thinking about a syntax for, "anyhow, as I was saying", I was going to suggest ..., as in:

The boat ride took us through the everglades.

> It was one of those "airboats" with the giant propeller.

... We saw a lot of birds but no alligators.

But that causes a pretty big indent, and ... already automatically gets you a "…" in djot, and it might cause problems when the author wants an actual ellipsis.

The leading underscore is OK, but it does also make me think of italics.

Since "and" is at least somewhat close to "anyhow, as I was saying", maybe &?

The boat ride took us through the everglades.

> It was one of those "airboats" with the giant propeller.

& We saw a lot of birds but no alligators.

I like that one because,

  • & is not currently used for any other djot syntax,
  • I can read it as "and" ("and as I was saying") and it kinda works. :)
  • the glyph itself also looks somewhat like any other alphabet letter, and so is not as distracting (does not stand out so much on the page) as the _, which I think is a desirable characteristic for this bit of markup.

david-christiansen commented on June 19, 2024

Including lists in paragraphs is an important use case for the kind of writing that I do, at least, and neither the suggestion of a leading _ nor the suggestion of indentation works well for this case.

I would need to distinguish between all of the following:

A:

Some text:
\begin{itemize}
\item A
\item B
\end{itemize}
And more text

B:

Some text:

\begin{itemize}
\item A
\item B
\end{itemize}
And more text

C:

Some text:
\begin{itemize}
\item A
\item B
\end{itemize}

And more text

D:

Some text:

\begin{itemize}
\item A
\item B
\end{itemize}

And more text

A leading underscore works to distinguish A from C, but not A from B, nor C from D. It catches part of the A/D distinction.

Indentation doesn't work for any of them.

An alternative design is a convention that there's a div for "multi-paragraphs" that contain multiple block elements. It's ugly but accurate:

::: {.paragraph}
Some text:

* A
* B

And more text
:::

would denote option A.

This would be tool-specific, but that's perhaps acceptable: I think the need for this kind of thing tends to arise in long-form scientific writing more than in smaller, casual documents, so having a Googleable solution like this is probably enough. This also remains compatible with the various ASTs out there.

jgm commented on June 19, 2024

One could use a single dot on a line as a "connector" that says: the following normally-block-level thing is to be considered as part of the current paragraph. Then your A is

Some text
.
- A
- B
.
more text

and your B is

Some text

- A
- B
.
more text

and so on. Of course, this would require figuring out an AST model that actually permits this sort of thing. And some (most?) output formats just won't allow a list or a block quote to be part of a paragraph: in HTML for example, a p element can only contain "phrasing content."
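
A rough Python sketch of how such a connector line might be read. This is only an illustration: `split_blocks` and the "joined" flag are hypothetical names, and real djot parsing does not work by splitting lines like this. The sketch sidesteps the nested-AST question by keeping a flat sequence of chunks and recording only whether each chunk continues the previous one.

```python
def split_blocks(source: str) -> list[tuple[str, bool]]:
    """Split text into (chunk, joined_to_previous) pairs.

    A blank line separates chunks; a line holding only "." also separates
    them, but marks the next chunk as joined to the one before it.
    """
    blocks: list[tuple[str, bool]] = []
    current: list[str] = []
    joined = False       # flag for the chunk currently being collected
    next_joined = False  # flag the next chunk will receive

    for line in source.splitlines():
        stripped = line.strip()
        if stripped in ("", "."):
            if current:
                blocks.append(("\n".join(current), joined))
                current = []
            # The last separator seen wins; a connector needs a predecessor.
            next_joined = stripped == "." and bool(blocks)
            continue
        if not current:
            joined = next_joined
            next_joined = False
        current.append(line)

    if current:
        blocks.append(("\n".join(current), joined))
    return blocks


example_a = "Some text\n.\n- A\n- B\n.\nmore text"
example_b = "Some text\n\n- A\n- B\n.\nmore text"

for chunk, is_joined in split_blocks(example_a):
    print(is_joined, repr(chunk))
```

In this model A and B differ only in the flag on the list chunk: joined in A, not joined in B.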

jgm commented on June 19, 2024

The question is not about the syntax of the block quote, but about how to mark what follows it as either a new paragraph or a continuation of the previous one.

jgm commented on June 19, 2024

Yes, got it now.

vassudanagunta commented on June 19, 2024

Does djot's AST support block elements nested within a paragraph?

jgm commented on June 19, 2024

We wouldn't need the AST to support block elements as children of a paragraph. It would be sufficient just to be able to mark the following content as "not a new paragraph."
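
A minimal Python sketch of that idea, assuming a flat block sequence in which a paragraph carries a boolean flag. The `Block` type, the `continuation` field, and the `continuation` CSS class are all hypothetical and are not part of djot's or Pandoc's actual types.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Block:
    kind: str                   # "para" or "blockquote"
    text: str
    continuation: bool = False  # hypothetical "not a new paragraph" marker


def render_html(blocks: list[Block]) -> str:
    """Render a flat block list; flagged paragraphs get a CSS class."""
    out = []
    for block in blocks:
        body = escape(block.text, quote=False)
        if block.kind == "blockquote":
            out.append(f"<blockquote><p>{body}</p></blockquote>")
        else:
            cls = ' class="continuation"' if block.continuation else ""
            out.append(f"<p{cls}>{body}</p>")
    return "\n".join(out)


blocks = [
    Block("para", "The boat ride took us through the everglades."),
    Block("blockquote", 'It was one of those "airboats" with the giant propeller.'),
    Block("para", "We saw a lot of birds but no alligators.", continuation=True),
]

print(render_html(blocks))
```

The tree stays flat, so nothing becomes a child of a paragraph; a renderer or stylesheet can still treat the flagged paragraph differently, for example by suppressing its first-line indent.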

vassudanagunta commented on June 19, 2024

@jgm,

> We wouldn't need the AST to support block elements as children of a paragraph. It would be sufficient just to be able to mark the following content as "not a new paragraph."

I understand. Would you mind answering a related, long-standing question I've had about terminology?

Is there a distinction between an abstract syntax tree and an intermediate representation? Since djot parses to an AST, and since you are proposing a new djot syntax for paragraph continuation, your suggested approach above makes sense. But if instead you needed to model a general abstraction of structured text, independent of any specific syntax, such as the "AST" at Pandoc's core, then it might be better to represent it as a single paragraph with a nested block quote, yes? And whether or not that is the better representation of this specific case, would you agree that there is nonetheless a difference between an AST and an IR, and that the core data structure of Pandoc is better characterized as an IR?

jgm commented on June 19, 2024

It might make more sense conceptually to allow a block quote to be a child of a paragraph. But this would make the interface with Pandoc's types more complicated. I don't know what is best.

About terminology, I'd say that "IR" is the genus and "AST" is one species.

dsanson commented on June 19, 2024

Just commenting to second @bpj's proposed use of indentation for this.

mygithubdevaccount commented on June 19, 2024

AsciiDoc uses the plus sign (+) as the so-called list continuation marker: https://docs.asciidoctor.org/asciidoc/latest/syntax-quick-reference/#ex-complex
