html-writer's Introduction

html-writer

A Racket library for converting X-expressions to strings of HTML with content-aware line wrapping and indentation.

Expected benefits of this library over xexpr->string, xexpr->html and display-xml/content:

  • Wraps and indents based on the tag type according to HTML5 standard (block/flow/inline) rather than based on scanning a tag's contents
  • Allows you to set the column width of the output
  • Never inserts a line break where doing so would introduce whitespace in the HTML output
  • Attribute values are never broken across lines
  • Does not modify content of <script> and <style> tags
  • Unicode-aware: wraps based on grapheme count rather than character count (depends on unicode-breaks)
  • Outputs boolean attributes the HTML5 way (<option selected> rather than <option selected="selected">)
  • Outputs void (self-closing) elements the HTML5 way (<meta charset="UTF-8">, closed with > rather than />)
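
As a rough illustration of the last two points, here is a minimal sketch of how an opening tag could be rendered the HTML5 way. The names boolean-attrs, attr->string and open-tag are hypothetical and are not taken from html-writer; the list of boolean attributes is a small illustrative subset, and no escaping is done.

```racket
#lang racket/base
(require racket/string)

;; Illustrative subset only (an assumption); the library's own table may differ.
(define boolean-attrs '(selected checked disabled hidden readonly required))

;; Render one (name "value") attribute pair.
;; Boolean attributes are written as the bare name; no escaping is performed.
(define (attr->string attr)
  (define name (car attr))
  (if (memq name boolean-attrs)
      (symbol->string name)
      (format "~a=\"~a\"" name (cadr attr))))

;; Render only the opening tag. A void element such as <meta> would be
;; written with this opening tag alone: no "/>" and no closing tag.
(define (open-tag tag attrs)
  (string-append
   "<"
   (symbol->string tag)
   (string-append* (map (lambda (a) (string-append " " (attr->string a))) attrs))
   ">"))

(open-tag 'option '((selected "selected")))
(open-tag 'meta '((charset "UTF-8")))
```

The two calls at the end correspond to the <option selected> and <meta charset="UTF-8"> examples in the list above.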

Still in early stages; comments and PRs welcome.

See the tests at the end of main.rkt to get an idea of what it can do so far. The existing tests currently pass, but there is still a lot to do:

  • Add many more tests
  • Add lookahead (or deferred output) for smarter wrapping (see this test)
  • Defer whitespace to avoid trailing spaces on lines (update tests first)
  • Compare classification of “block” and “flow” tags against the HTML5 standard
  • Thoroughly compare output against HTML Tidy
  • Add option to self-close tags the XHTML way (/>)
  • Split tests into their own files
  • Documentation

Questions to resolve

  • How important is the Unicode/grapheme thing, and what is the performance cost of using in-words from unicode-breaks? The alternative is to simply let the wrapping be a little bit wrong when graphemes made of multiple code points are present.

  • How much logging/debugging instrumentation should be left in? (Probably none except for errors, but see next question)

  • What should happen when an X-expression's structure is not valid HTML, such as a <div> inside a <p>? Currently it just logs an error, but maybe it should throw an exception?

  • Support CDATA/PCDATA? (Probably not?) If not, then the things we are converting maybe aren’t technically X-expressions. What exactly are they, and how should they be validated?
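
To make the "log an error or throw an exception" question above more concrete, here is a minimal sketch of a nesting check that supports both behaviors. Everything in it is an assumption for illustration: block-tags is a small subset, and find-invalid-nesting, check-nesting and the #:strict? keyword are hypothetical names, not part of html-writer. It only detects block tags placed directly inside a <p>.

```racket
#lang racket/base
(require racket/list)

;; Illustrative subset of block-level tags (an assumption).
(define block-tags '(div p ul ol table section article))

;; An attribute list is either empty or a list of (name "value") pairs.
(define (attr-list? v)
  (and (list? v) (or (null? v) (pair? (car v)))))

;; Children of an element, skipping the attribute list if present.
(define (element-children x)
  (define items (cdr x))
  (if (and (pair? items) (attr-list? (car items)))
      (cdr items)
      items))

;; Collect (parent child) tag pairs where a block tag sits directly inside <p>.
(define (find-invalid-nesting x)
  (cond
    [(not (pair? x)) '()]
    [else
     (define tag (car x))
     (define kids (element-children x))
     (append
      (for/list ([k (in-list kids)]
                 #:when (and (eq? tag 'p) (pair? k) (memq (car k) block-tags)))
        (list tag (car k)))
      (append-map find-invalid-nesting kids))]))

;; With #:strict? #t the check raises; otherwise it logs and returns #f.
(define (check-nesting x #:strict? [strict? #f])
  (define problems (find-invalid-nesting x))
  (cond
    [(null? problems) #t]
    [strict? (raise-arguments-error 'check-nesting
                                    "invalid HTML nesting"
                                    "problems" problems)]
    [else (log-error "invalid HTML nesting: ~s" problems)
          #f]))
```

A keyword switch like this would let callers opt in to exceptions while keeping the current logging behavior as the default.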

html-writer's People

Contributors

otherjoel

html-writer's Issues

Whitespace between words is not always preserved

(display (xexpr->html5 '(p "What if there are linebreaks\nin the input?") #:wrap-at 20))

Produces:

<p>What if there 
are linebreaksin 
the input?</p>

but should produce:

<p>What if there 
are linebreaks in 
the input?</p>
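
One plausible way to get the expected result, shown here only as a sketch and not necessarily the fix the library uses, is to treat a newline in text content like any other whitespace before wrapping. The helper name collapse-whitespace is hypothetical. It would not be appropriate for content where whitespace is significant, such as text inside <pre>.

```racket
#lang racket/base

;; Hypothetical helper, not part of html-writer: replace each run of
;; whitespace (spaces, tabs, newlines) with a single space. Leading and
;; trailing runs become one space rather than being trimmed.
(define (collapse-whitespace str)
  (regexp-replace* #px"\\s+" str " "))

(collapse-whitespace "What if there are linebreaks\nin the input?")
```

After this step "linebreaks" and "in" are separated by an ordinary space, so a wrapper that breaks lines at spaces can no longer join them.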

Input with spaces causes inconsistent indentation

When I have an xexpr? that contains whitespace-only strings between elements, the indentation comes out wonky.

Input

(html
 ()
 "\n  "
 (head () "\n    " (title () "Minimal Post") "\n  ")
 "\n  "
 (body
  ()
  "\n    "
  (div
   ((class "article"))
   (article
    (h1 "Minimal Post")
    (p
     "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas nisi libero,\nscelerisque vel nulla eu, tristique porta metus.")))
  "\n  ")
 "\n")

Output

<html>
    <head>
        <title>Minimal Post</title>
  
  </head>
    <body>
        <div class="article">
      <article>
        <h1>Minimal Post</h1>
        <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas 
        nisi libero,scelerisque vel nulla eu, tristique porta metus.</p>
      </article>
    </div>
  
  </body>

</html>
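
A possible workaround on the caller's side, sketched under stated assumptions, is to drop whitespace-only strings from the X-expression before handing it to the writer. The function names below are hypothetical and not part of html-writer. Note that this is only safe between block-level elements: a whitespace-only string between two inline elements is significant, and this sketch does not make that distinction.

```racket
#lang racket/base

;; True for strings that are empty or contain only whitespace.
(define (whitespace-only? v)
  (and (string? v) (regexp-match? #px"^\\s*$" v)))

;; An attribute list is either empty or a list of (name "value") pairs.
(define (attr-list? v)
  (and (list? v) (or (null? v) (pair? (car v)))))

;; Recursively remove whitespace-only string children.
;; Attribute lists are passed through untouched.
(define (strip-whitespace-children x)
  (cond
    [(not (pair? x)) x]
    [else
     (define tag (car x))
     (define-values (attrs kids)
       (if (and (pair? (cdr x)) (attr-list? (cadr x)))
           (values (list (cadr x)) (cddr x))
           (values '() (cdr x))))
     `(,tag ,@attrs
            ,@(for/list ([k (in-list kids)]
                         #:unless (whitespace-only? k))
                (strip-whitespace-children k)))]))

(strip-whitespace-children
 '(html () "\n  " (head () "\n    " (title () "Minimal Post") "\n  ") "\n"))
```

The newline inside the paragraph text of the input is a separate matter and is the subject of the previous issue.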
