simonw / datasette-render-markdown

Datasette plugin for rendering Markdown

License: Apache License 2.0

Python 100.00%

datasette datasette-plugin markdown datasette-io

datasette-render-markdown's Introduction

datasette-render-markdown

Datasette plugin for rendering Markdown.

Installation

Install this plugin in the same environment as Datasette to enable this new functionality:

datasette install datasette-render-markdown

Usage

You can explicitly list the columns you would like to treat as Markdown using plugin configuration in a metadata.json file.

Add a "datasette-render-markdown" configuration block and use a "columns" key to list the columns you would like to treat as Markdown values:

{
    "plugins": {
        "datasette-render-markdown": {
            "columns": ["body"]
        }
    }
}

This will cause any body column in any table to be treated as markdown and safely rendered using Python-Markdown. The resulting HTML is then run through Bleach to avoid the risk of XSS security problems.

Save this to metadata.json and run Datasette with the --metadata flag to load this configuration:

$ datasette serve mydata.db --metadata metadata.json

The configuration block can be used at the top level, or it can be applied just to specific databases or tables. Here's how to apply it to just the entries table in the news.db database:

{
    "databases": {
        "news": {
            "tables": {
                "entries": {
                    "plugins": {
                        "datasette-render-markdown": {
                            "columns": ["body"]
                        }
                    }
                }
            }
        }
    }
}

And here's how to apply it to every body column in every table in the news.db database:

{
    "databases": {
        "news": {
            "plugins": {
                "datasette-render-markdown": {
                    "columns": ["body"]
                }
            }
        }
    }
}
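
The three placements above (top level, database, table) can be read as a lookup that goes from most specific to least specific. The following is a minimal sketch of that idea, not Datasette's own code: `resolve_plugin_config` is a hypothetical helper, and it assumes a table-level block takes precedence over a database-level block, which in turn takes precedence over the top-level block.

```python
def resolve_plugin_config(metadata, plugin, database=None, table=None):
    """Hypothetical helper: return the most specific config block for a plugin."""
    db_meta = metadata.get("databases", {}).get(database, {})
    table_meta = db_meta.get("tables", {}).get(table, {})
    # Assumed precedence: table level, then database level, then top level.
    for scope in (table_meta, db_meta, metadata):
        config = scope.get("plugins", {}).get(plugin)
        if config is not None:
            return config
    return None


metadata = {
    "plugins": {"datasette-render-markdown": {"columns": ["summary"]}},
    "databases": {
        "news": {
            "tables": {
                "entries": {
                    "plugins": {
                        "datasette-render-markdown": {"columns": ["body"]}
                    }
                }
            }
        }
    },
}

plugin = "datasette-render-markdown"
entries_config = resolve_plugin_config(metadata, plugin, "news", "entries")
authors_config = resolve_plugin_config(metadata, plugin, "news", "authors")
```

In this sketch the entries table picks up its own "columns" list, while a table with no block of its own (the made-up authors table) falls back to the top-level one.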

Columns that match a naming convention

This plugin can also render markdown in any columns that match a specific naming convention.

By default, columns that have a name ending in _markdown will be rendered.

You can try this out using the following query:

select '# Hello there

* This is a list
* of items

[And a link](https://github.com/simonw/datasette-render-markdown).'
as demo_markdown

You can configure a different list of wildcard patterns using the "patterns" configuration key. Here's how to render columns that end in either _markdown or _md:

{
    "plugins": {
        "datasette-render-markdown": {
            "patterns": ["*_markdown", "*_md"]
        }
    }
}

To disable wildcard column matching entirely, set "patterns": [] in your plugin metadata configuration.
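
To illustrate how wildcard patterns like these behave, here is a small sketch that uses glob-style matching from Python's standard library. It assumes the patterns follow `fnmatch` rules; `column_matches` and `DEFAULT_PATTERNS` are hypothetical names used for illustration, and the plugin's own matching code may differ.

```python
from fnmatch import fnmatchcase

# Assumed default, based on the "_markdown" naming convention described above.
DEFAULT_PATTERNS = ["*_markdown"]


def column_matches(column, patterns=None):
    """Hypothetical helper: True if the column name matches any pattern."""
    if patterns is None:
        patterns = DEFAULT_PATTERNS
    # An empty list matches nothing, which mirrors "patterns": [].
    return any(fnmatchcase(column, pattern) for pattern in patterns)


default_match = column_matches("demo_markdown")
custom_match = column_matches("notes_md", ["*_markdown", "*_md"])
disabled_match = column_matches("demo_markdown", [])
```

With the default pattern only demo_markdown matches; adding "*_md" also matches notes_md, and an empty list disables matching.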

Markdown extensions

The Python-Markdown library that powers this plugin supports extensions, both bundled and third-party. These can be used to enable additional Markdown features such as table support.

You can configure support for extensions using the "extensions" key in your plugin metadata configuration.

Since extensions may introduce new HTML tags, you will also need to add those tags to the list of tags that are allowed by the Bleach sanitizer. You can do that using the "extra_tags" key, and you can allow-list additional HTML attributes using "extra_attrs". See the Bleach documentation for more information on this.

Here's how to enable support for Markdown tables:

{
    "plugins": {
        "datasette-render-markdown": {
            "extensions": ["tables"],
            "extra_tags": ["table", "thead", "tr", "th", "td", "tbody"]
        }
    }
}

GitHub-Flavored Markdown

Enabling GitHub-Flavored Markdown (useful if you are working with data imported from GitHub using github-to-sqlite) is a little more complicated.

First, you will need to install the py-gfm package:

$ pip install py-gfm

Note that py-gfm has a bug that causes it to pin to Markdown<3.0, so if you are using it you should install it before installing datasette-render-markdown to ensure you get a compatible version of that dependency.

Now you can configure it like this. Note that the extension name is mdx_gfm:GithubFlavoredMarkdownExtension and you need to allow-list several extra HTML tags and attributes:

{
    "plugins": {
        "datasette-render-markdown": {
            "extra_tags": [
                "hr",
                "br",
                "details",
                "summary",
                "input"
            ],
            "extra_attrs": {
                "input": [
                    "type",
                    "disabled",
                    "checked"
                ]
            },
            "extensions": [
                "mdx_gfm:GithubFlavoredMarkdownExtension"
            ]
        }
    }
}

The <input type="" checked disabled> attributes are needed to support rendering checkboxes in issue descriptions.

Markdown in templates

The plugin introduces a new template tag: {% markdown %}...{% endmarkdown %} - which can be used to render Markdown in your Jinja templates.

{% markdown %}
# This will be rendered as markdown
{% endmarkdown %}

You can use attributes on the {% markdown %} tag to enable extensions and allow-list additional tags and attributes:

{% markdown
  extensions="tables"
  extra_tags="table thead tr th td tbody" 
  extra_attrs="p:id,class a:name,href" %}
## Markdown table

First Header  | Second Header
------------- | -------------
Content Cell  | Content Cell
Content Cell  | Content Cell

<a href="https://www.example.com/" name="namehere">Example</a>
<p id="paragraph" class="klass">Paragraph</p>
{% endmarkdown %}

The extensions= and extra_tags= attributes accept a space-separated list of values.

The extra_attrs= attribute accepts a space-separated list of tag:attr1,attr2 values - each tag can specify one or more attributes that should be allowed.
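
The attribute formats described above are easy to picture as plain string splitting. The sketch below shows one way such strings could be turned into Python structures; `parse_space_list` and `parse_extra_attrs` are hypothetical helpers written for illustration, not functions exported by the plugin.

```python
def parse_space_list(value):
    """Hypothetical helper: split a space-separated attribute into a list."""
    return value.split()


def parse_extra_attrs(value):
    """Hypothetical helper: turn 'tag:attr1,attr2 tag2:attr3' into a dict."""
    attrs = {}
    for item in value.split():
        # Everything before the first colon is the tag name.
        tag, _, names = item.partition(":")
        attrs[tag] = [name for name in names.split(",") if name]
    return attrs


tags = parse_space_list("table thead tr th td tbody")
attrs = parse_extra_attrs("p:id,class a:name,href")
```

Here `attrs` maps p to id and class, and a to name and href, which has the same shape as the "extra_attrs" dictionary used in the JSON configuration.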

You can also use the {{ render_markdown(...) }} function, like this:

{{ render_markdown("""
## Markdown table

First Header  | Second Header
------------- | -------------
Content Cell  | Content Cell
Content Cell  | Content Cell
""", extensions=["tables"],
    extra_tags=["table", "thead", "tr", "th", "td", "tbody"]) }}

The {% markdown %} tag is recommended, as it avoids the need to \" escape quotes in your Markdown content.

datasette-render-markdown's Issues

Metadata to configure wildcard patterns

For configuring patterns, I'll go with this:

{
    "plugins": {
        "datasette-render-markdown": {
            "patterns": ["*_md"]
        }
    }
}

This supports multiple patterns. It also means you can disable wildcard column matching entirely with "patterns": [].

Originally posted by @simonw in #1 (comment)

Double the whitespace on latest Datasette

simonw/datasette#896 added white-space: pre-wrap to table cells - which means Markdown columns now get double the whitespace:

https://latest-with-plugins.datasette.io/fixtures?sql=select+%27%23+Hello+there%0D%0A%0D%0A*+This+is+a+list%0D%0A*+of+items%0D%0A%0D%0A%5BAnd+a+link%5D%28https%3A%2F%2Fgithub.com%2Fsimonw%2Fdatasette-render-markdown%29.%27%0D%0Aas+demo_markdown

`{% markdown %} ... {% endmarkdown %}` template tag

While using render_markdown() for pages in https://datasette.io/tutorials I kept running into this annoyance:

{{ render_markdown("""
# Data analysis with SQLite and Python
"example" and 'example'
""") }}

This raises an error - the double quotes need to be \" escaped.

If you switch to ''' then you need to use \' for the single quotes instead.

After some exploration, it looks like the solution is to implement this:

{% markdown %}
# Data analysis with SQLite and Python
"example" and 'example'

{% endmarkdown %}

Plugin configuration: configure columns

Plugin configuration options should include:

Enable plugin for specific columns on specific tables
Disable the *_markdown wildcard feature (now in #3)

Markdown renders with lots of vertical white space

The plugin renders markdown with lots of extra vertical white space. eg.

Note this is just a set of titles, paragraphs and lists. No blank paragraphs or line breaks.

The culprit appears to be the CSS, specifically:

white-space: pre-wrap;

Removing that rule makes it render as I'd expect, without the additional vertical space.

Render image tags

Is there a way for this to support rendering images so they appear like datasette-render-images but using the Markdown syntax ![Alt Text](path/to/image.jpg)? Right now for me, the table view displays the raw HTML <img alt="" src="/path/to/image.jpg"> without rendering the image itself.

Or is this a case where I'd be better off downloading all those images locally and using datasette-render-images?

Ensure links are output with href intact

May have a bug where Markdown [link](url) syntax isn't working because I have not correctly whitelisted a[href], eg on https://github-to-sqlite.dogsheep.net/github/releases/26096691 compared to https://github.com/simonw/sqlite-utils/releases/tag/2.7.1

Check what happens if extra attributes have been defined.

Support GitHub-Flavored Markdown and other extensions

https://py-gfm.readthedocs.io/en/latest/

diff --git a/datasette_render_markdown/__init__.py b/datasette_render_markdown/__init__.py
index 018f433..826b9b5 100644
--- a/datasette_render_markdown/__init__.py
+++ b/datasette_render_markdown/__init__.py
@@ -3,6 +3,7 @@ import bleach
 import markdown
 from datasette import hookimpl
 import jinja2
+from mdx_gfm import GithubFlavoredMarkdownExtension
 
 
 @hookimpl()
@@ -19,7 +20,11 @@ def render_cell(value, column):
 def render_markdown(value):
     html = bleach.linkify(
         bleach.clean(
-            markdown.markdown(value, output_format="html5"),
+            markdown.markdown(
+                value,
+                output_format="html5",
+                extensions=[GithubFlavoredMarkdownExtension()],
+            ),
             tags=[
                 "a",
                 "abbr",

Things that look like URLs in `<pre>` tags are being linkified

For example:

The Markdown for that was:

This query returns the names of everyone who has been a Vice President:

    select
      executives.name
    from
      executive_terms
      join executives on executive_terms.executive_id = executives.id
    where
      type = 'viceprez'

"Q&A against documentation" on the datasette.io homepage

Example of using template to mark up markdown in a variable

As a Jinja novice, if I have a template with some Markdown referenced by a JavaScript variable, how do I style it?

Presumably, the plugin can't handle that?

The example in the docs shows how to style literal Markdown, but one likely use case is styling Markdown retrieved by a SQL query against the Datasette API. The JavaScript variable is undefined if called naively as {{render_markdown(md)}}. Is there a way to write the query with a function that takes Markdown and returns HTML, e.g. select render_markdown(md) from etc.?
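
For what it's worth, SQLite lets Python code register custom SQL functions, so a query of that shape is possible in principle. The sketch below uses only the standard library and a stand-in renderer: `fake_render_markdown` is hypothetical and merely escapes HTML and wraps the text in a paragraph, where a real setup would call an actual Markdown renderer and sanitizer. Nothing here is confirmed to be something this plugin provides.

```python
import html
import sqlite3


def fake_render_markdown(text):
    # Stand-in renderer (assumption): not real Markdown rendering.
    if text is None:
        return None
    return "<p>{}</p>".format(html.escape(text))


conn = sqlite3.connect(":memory:")
# Register the Python function as a one-argument SQL function.
conn.create_function("render_markdown", 1, fake_render_markdown)

conn.execute("create table docs (md text)")
conn.execute("insert into docs (md) values (?)", ("a < b",))
row = conn.execute("select render_markdown(md) from docs").fetchone()
rendered = row[0]
```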

Crashes on `None`

https://github.com/simonw/datasette.io/actions/runs/5863096422/job/15895990775

  File "/opt/hostedtoolcache/Python/3.9.17/x64/lib/python3.9/site-packages/datasette_render_markdown/__init__.py", line 79, in render_markdown
    markdown.markdown(value, output_format="html5", extensions=extensions or [])
  File "/opt/hostedtoolcache/Python/3.9.17/x64/lib/python3.9/site-packages/markdown/core.py", line 377, in markdown
    return md.convert(text)
  File "/opt/hostedtoolcache/Python/3.9.17/x64/lib/python3.9/site-packages/markdown/core.py", line 238, in convert
    if not source.strip():
AttributeError: 'NoneType' object has no attribute 'strip'
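
The traceback shows `.strip()` being called on a None value. One way to avoid that class of error is to check for None before rendering. The sketch below shows that idea only: `render_or_empty` and `fake_render` are hypothetical, and this is not necessarily how the issue was resolved in the plugin.

```python
def render_or_empty(value, render):
    """Hypothetical wrapper: skip rendering when there is nothing to render."""
    if value is None:
        return ""
    return render(value)


def fake_render(text):
    # Stand-in for the real renderer; it calls .strip() like the code in the
    # traceback, so it would also fail if handed None directly.
    return "<p>{}</p>".format(text.strip())


empty = render_or_empty(None, fake_render)
rendered = render_or_empty("  hello  ", fake_render)
```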