remarkjs / remark

markdown processor powered by plugins part of the @unifiedjs collective

License: MIT License

remark's Introduction

remark is a tool that transforms markdown with plugins. These plugins can inspect and change your markup. You can use remark on the server, the client, CLIs, deno, etc.

Feature highlights

compliant — 100% to CommonMark, 100% to GFM or MDX with a plugin
ASTs — inspecting and changing content made easy
popular — world’s most popular markdown parser
plugins — 150+ plugins you can pick and choose from

Intro

remark is an ecosystem of plugins that work with markdown as structured data, specifically ASTs (abstract syntax trees). ASTs make it easy for programs to deal with markdown. We call those programs plugins. Plugins inspect and change trees. You can use the many existing plugins or you can make your own.

to learn markdown, see this cheatsheet and tutorial
for more about us, see unifiedjs.com
for updates, see Twitter
for questions, see support
to help, see contribute or sponsor below

What is this?
When should I use this?
Plugins
Examples
Syntax
Syntax tree
Types
Compatibility
Security
Contribute
Sponsor
License

What is this?

With this project and a plugin, you can turn this markdown:

# Hello, *Mercury*!

…into the following HTML:

<h1>Hello, <em>Mercury</em>!</h1>

import rehypeStringify from 'rehype-stringify'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import {unified} from 'unified'

const file = await unified()
  .use(remarkParse)
  .use(remarkRehype)
  .use(rehypeStringify)
  .process('# Hello, *Mercury*!')

console.log(String(file)) // => '<h1>Hello, <em>Mercury</em>!</h1>'

With another plugin, you can turn this markdown:

# Hi, Saturn!

…into the following markdown:

## Hi, Saturn!

import remarkParse from 'remark-parse'
import remarkStringify from 'remark-stringify'
import {unified} from 'unified'
import {visit} from 'unist-util-visit'

const file = await unified()
  .use(remarkParse)
  .use(myRemarkPluginToIncreaseHeadings)
  .use(remarkStringify)
  .process('# Hi, Saturn!')

console.log(String(file)) // => '## Hi, Saturn!'

function myRemarkPluginToIncreaseHeadings() {
  /**
   * @param {import('mdast').Root} tree
   */
  return function (tree) {
    visit(tree, function (node) {
      if (node.type === 'heading') {
        node.depth++
      }
    })
  }
}

You can use remark for many different things. unified is the core project that transforms content with ASTs. remark adds support for markdown to unified. mdast is the markdown AST that remark uses.

This GitHub repository is a monorepo that contains the following packages:

remark-parse — plugin to take markdown as input and turn it into a syntax tree (mdast)
remark-stringify — plugin to take a syntax tree (mdast) and turn it into markdown as output
remark — unified, remark-parse, and remark-stringify, useful when input and output are markdown
remark-cli — CLI around remark to inspect and format markdown in scripts

When should I use this?

Depending on the input you have and output you want, you can use different parts of remark. If the input is markdown, you can use remark-parse with unified. If the output is markdown, you can use remark-stringify with unified. If both the input and output are markdown, you can use remark on its own. When you want to inspect and format markdown files in a project, you can use remark-cli.

If you just want to turn markdown into HTML (with maybe a few extensions), we recommend micromark instead.

If you don’t use plugins and want to deal with syntax trees manually, you can use mdast-util-from-markdown and mdast-util-to-markdown.

Plugins

remark plugins deal with markdown. Some popular examples are:

remark-gfm — add support for GFM (GitHub flavored markdown)
remark-lint — inspect markdown and warn about inconsistencies
remark-toc — generate a table of contents
remark-rehype — turn markdown into HTML

These plugins are good examples because they do quite different things: remark-gfm extends markdown syntax, remark-lint inspects trees, remark-toc changes trees, and remark-rehype transforms to another syntax tree.

You can choose from the 150+ plugins that already exist. Here are three good ways to find plugins:

awesome-remark — selection of the most awesome projects
List of plugins — list of all plugins
remark-plugin topic — any tagged repo on GitHub

Some plugins are maintained by us here in the @remarkjs organization while others are maintained by folks elsewhere. Anyone can make remark plugins, so as always when choosing whether to include dependencies in your project, make sure to carefully assess the quality of remark plugins too.

Examples

Example: turning markdown into HTML

remark is an ecosystem around markdown. A different ecosystem is for HTML: rehype. The following example turns markdown into HTML by combining both ecosystems with remark-rehype:

import rehypeSanitize from 'rehype-sanitize'
import rehypeStringify from 'rehype-stringify'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import {unified} from 'unified'

const file = await unified()
  .use(remarkParse)
  .use(remarkRehype)
  .use(rehypeSanitize)
  .use(rehypeStringify)
  .process('# Hello, Neptune!')

console.log(String(file))

Yields:

<h1>Hello, Neptune!</h1>

Example: support for GFM and frontmatter

remark supports CommonMark by default. Non-standard markdown extensions can be enabled with plugins. The following example adds support for GFM (autolink literals, footnotes, strikethrough, tables, tasklists) and frontmatter (YAML):

import rehypeStringify from 'rehype-stringify'
import remarkFrontmatter from 'remark-frontmatter'
import remarkGfm from 'remark-gfm'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import {unified} from 'unified'

const doc = `---
layout: solar-system
---

# Hi ~~Mars~~Venus!
`

const file = await unified()
  .use(remarkParse)
  .use(remarkFrontmatter)
  .use(remarkGfm)
  .use(remarkRehype)
  .use(rehypeStringify)
  .process(doc)

console.log(String(file))

Yields:

<h1>Hi <del>Mars</del>Venus!</h1>

Example: checking markdown

The following example checks that markdown code style is consistent and follows recommended best practices:

import {remark} from 'remark'
import remarkPresetLintConsistent from 'remark-preset-lint-consistent'
import remarkPresetLintRecommended from 'remark-preset-lint-recommended'
import {reporter} from 'vfile-reporter'

const file = await remark()
  .use(remarkPresetLintConsistent)
  .use(remarkPresetLintRecommended)
  .process('1) Hello, _Jupiter_ and *Neptune*!')

console.error(reporter(file))

Yields:

          warning Missing newline character at end of file final-newline             remark-lint
1:1-1:35  warning Marker style should be `.`               ordered-list-marker-style remark-lint
1:4       warning Incorrect list-item indent: add 1 space  list-item-indent          remark-lint
1:25-1:34 warning Emphasis should use `_` as a marker      emphasis-marker           remark-lint

⚠ 4 warnings

Example: checking and formatting markdown on the CLI

The following example checks and formats markdown with remark-cli, which is the CLI (command line interface) of remark that you can use in your terminal. This example assumes you’re in a Node.js package.

First, install the CLI and plugins:

npm install remark-cli remark-preset-lint-consistent remark-preset-lint-recommended remark-toc --save-dev

…then add an npm script in your package.json:

  /* … */
  "scripts": {
    /* … */
    "format": "remark . --output",
    /* … */
  },
  /* … */

💡 Tip: add ESLint and such in the format script too.

The above change adds a format script, which can be run with npm run format. It runs remark on all markdown files (.) and rewrites them (--output). Run ./node_modules/.bin/remark --help for more info on the CLI.

Then, add a remarkConfig to your package.json to configure remark:

  /* … */
  "remarkConfig": {
    "settings": {
      "bullet": "*", // Use `*` for list item bullets (default)
      // See <https://github.com/remarkjs/remark/tree/main/packages/remark-stringify> for more options.
    },
    "plugins": [
      "remark-preset-lint-consistent", // Check that markdown is consistent.
      "remark-preset-lint-recommended", // Few recommended rules.
      [
        // Generate a table of contents in `## Contents`
        "remark-toc",
        {
          "heading": "contents"
        }
      ]
    ]
  },
  /* … */

👉 Note: you must remove the comments in the above examples when copy/pasting them as comments are not supported in package.json files.

Finally, you can run the npm script to check and format markdown files in your project:

npm run format

Syntax

Markdown is parsed and serialized according to CommonMark. Other plugins can add support for syntax extensions.

We use micromark for our parsing. See its documentation for more information on markdown, CommonMark, and extensions.

Syntax tree

The syntax tree used in remark is mdast. It represents markdown constructs as JSON objects.

This markdown:

## Hello *Pluto*!

…yields the following tree (positional info removed for brevity):

{
  type: 'heading',
  depth: 2,
  children: [
    {type: 'text', value: 'Hello '},
    {type: 'emphasis', children: [{type: 'text', value: 'Pluto'}]},
    {type: 'text', value: '!'}
  ]
}
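
Because the tree is plain JSON-like data, it can be walked without any library. The following is a minimal sketch that assumes the node shape shown above; `collectText` is a hypothetical helper written for this illustration, not part of remark (in practice, a utility such as mdast-util-to-string covers this).

```javascript
// The tree from above as a plain object (positional info omitted).
const heading = {
  type: 'heading',
  depth: 2,
  children: [
    {type: 'text', value: 'Hello '},
    {type: 'emphasis', children: [{type: 'text', value: 'Pluto'}]},
    {type: 'text', value: '!'}
  ]
}

// Hypothetical helper: concatenate the text of a node and its descendants.
function collectText(node) {
  // Literal nodes (such as `text`) carry their content in `value`.
  if (typeof node.value === 'string') return node.value
  // Parent nodes (such as `heading`, `emphasis`) carry `children`.
  if (Array.isArray(node.children)) {
    return node.children.map(collectText).join('')
  }
  return ''
}

console.log(collectText(heading)) // => 'Hello Pluto!'
```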

Types

The remark organization and the unified collective as a whole are fully typed with TypeScript. Types for mdast are available in @types/mdast.

For TypeScript to work, it is important to type your plugins. For example:

/**
 * @typedef {import('mdast').Root} Root
 * @typedef {import('vfile').VFile} VFile
 */

/**
 * @typedef Options
 *   Configuration.
 * @property {boolean | null | undefined} [someField]
 *   Some option (optional).
 */

/**
 * My plugin.
 *
 * @param {Options | null | undefined} [options]
 *   Configuration (optional).
 * @returns
 *   Transform.
 */
export function myRemarkPluginAcceptingOptions(options) {
  /**
   * Transform.
   *
   * @param {Root} tree
   *   Tree.
   * @param {VFile} file
   *   File
   * @returns {undefined}
   *   Nothing.
   */
  return function (tree, file) {
    // Do things.
  }
}
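
To make the skeleton above more concrete, here is a minimal sketch of a plugin that actually reads its options. The plugin name, the `maxDepth` option, and the sample tree are made up for illustration, and the transform is called by hand on a plain object rather than through unified.

```javascript
/**
 * @typedef {import('mdast').Root} Root
 */

/**
 * @typedef Options
 *   Configuration (hypothetical).
 * @property {number | null | undefined} [maxDepth]
 *   Deepest heading rank to allow (default: `6`).
 */

/**
 * Hypothetical plugin: clamp heading depth to `maxDepth`.
 *
 * @param {Options | null | undefined} [options]
 *   Configuration (optional).
 * @returns
 *   Transform.
 */
function myRemarkPluginToClampHeadings(options) {
  const maxDepth = (options && options.maxDepth) || 6

  /**
   * @param {Root} tree
   * @returns {undefined}
   */
  return function (tree) {
    walk(tree)
  }

  // Visit every node; lower headings that are deeper than allowed.
  function walk(node) {
    if (node.type === 'heading' && node.depth > maxDepth) {
      node.depth = maxDepth
    }
    if (Array.isArray(node.children)) node.children.forEach(walk)
  }
}

const tree = {
  type: 'root',
  children: [
    {type: 'heading', depth: 5, children: [{type: 'text', value: 'Deep'}]}
  ]
}

myRemarkPluginToClampHeadings({maxDepth: 3})(tree)
console.log(tree.children[0].depth) // => 3
```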

Compatibility

Projects maintained by the unified collective are compatible with maintained versions of Node.js.

When we cut a new major release, we drop support for unmaintained versions of Node. This means we try to keep the current release line compatible with Node.js 16.

Security

As markdown can be turned into HTML and improper use of HTML can open you up to cross-site scripting (XSS) attacks, use of remark can be unsafe. When going to HTML, you will combine remark with rehype, in which case you should use rehype-sanitize.

Use of remark plugins could also open you up to other attacks. Carefully assess each plugin and the risks involved in using them.

For info on how to submit a report, see our security policy.

Contribute

See contributing.md in remarkjs/.github for ways to get started. See support.md for ways to get help. Join us in Discussions to chat with the community and contributors.

This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.

Sponsor

Support this effort and give back by sponsoring on OpenCollective!

Vercel, Motif, HashiCorp, GitBook, Gatsby, Netlify, Coinbase, ThemeIsle, Expo, Boost Note, Markdown Space, Holloway, You?

License

MIT © Titus Wormer

remark's Issues

Parse error in bullet with space before newline

I encountered this:

> mdast.parse('- \n')

TypeError: Cannot read property 'length' of null
  at Parser.tokenizeList (/Users/mizchi/sandbox/mdast/lib/parse.js:299:25)
  at Parser.tokenizeBlock (/Users/mizchi/sandbox/mdast/lib/parse.js:1572:28)
  at Parser.parse (/Users/mizchi/sandbox/mdast/lib/parse.js:1225:14)
  at Object.parse (/Users/mizchi/sandbox/mdast/lib/parse.js:1733:50)
  at repl:1:8
  at REPLServer.replDefaults.eval (/Users/mizchi/.nodebrew/node/v0.10.33/lib/node_modules/coffee-script/lib/coffee-script/repl.js:33:42)
  at repl.js:239:12
  at Interface.<anonymous> (/Users/mizchi/.nodebrew/node/v0.10.33/lib/node_modules/coffee-script/lib/coffee-script/repl.js:66:9)
  at Interface.emit (events.js:117:20)
  at Interface._onLine (readline.js:202:10)
  at Interface._line (readline.js:531:8)
  at Interface._ttyWrite (readline.js:760:14)
  at ReadStream.onkeypress (readline.js:99:10)
  at ReadStream.emit (events.js:117:20)
  at emitKey (readline.js:1095:12)
  at ReadStream.onData (readline.js:840:14)
  at ReadStream.emit (events.js:95:17)
  at ReadStream.<anonymous> (_stream_readable.js:764:14)
  at ReadStream.emit (events.js:92:17)
  at emitReadable_ (_stream_readable.js:426:10)
  at emitReadable (_stream_readable.js:422:5)
  at readableAddChunk (_stream_readable.js:165:9)
  at ReadStream.Readable.push (_stream_readable.js:127:10)
  at TTY.onread (net.js:528:21)

Github-flavored markdown html incompatibility

FYI, mdast does not parse HTML the way Github itself does. More specifically, it doesn't parse invalid HTML the same way Github does, or at least invalid HTML comments. If you have an HTML comment containing --, Github ignores this invalidity and still treats the overall comment as HTML and doesn't turn it into a paragraph.

I would say this is a bug rather than a feature, since no user-facing tool I've tried (e.g. Marked 2, MarkdownPad, MacDown, etc.) ever insists on HTML being valid HTML and reverting it to a paragraph otherwise. Likewise, of the parsers I've tried, mdast seems to be unique in this respect.

Is there an "encode" method to insert escaped text into the AST?

I sometimes have to insert text-as-is into the AST, e.g. I need to insert (Taylor, Stouffer, & Meehl, 2011) in a way that this exact text turns up in the markdown rendered to HTML. For this I need to insert something like $Taylor, Stouffer, & Meehl, 2011$. Can mdast do this for me? Or should I use something like markdown-escape?

range/location support?

Hi!

I noticed that CommonMark's AST now includes location info (though it is unstable).

{ t: 'Document',
  start_line: 1,
  start_column: 1,
  end_line: 20,
  children: []
}

An example is azu/commonmark-ast-sandbox.

Do you have any plans to support range or location info on the AST (like Esprima)?

Nested tasklist

Here is a trivial difference.

- [x] aaa
  - [ ] bbb
  - [ ] ccc

aaa
- bbb
- ccc

It looks like mdast doesn't handle nested task lists.

Should be able to expose style information

Probably not by default;
Information like which emphasis markers are used, asterisks or underscores;
This would highly benefit the creation of something mdlint-like.

Make it easier for plugins to add tokenizers to the parser

Looking here and here it seems like I need to have intimate knowledge of how the parser works in order to define regular expressions to tokenize. The use case is detecting and linking URLs (auto-linking) and @mentions. Some of the URLs I'd like to turn into special node types – such as "twitter", which another plugin could render as HTML for an embedded tweet.

Ideally I could write a plugin that only has to specify a regular expression, a function which returns the node, and some rules about scope (for example, I wouldn't want to create a link for a URL that is already inside a link).
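
The idea above (a regular expression, a function that returns the node, and a scope rule) can be sketched on a parsed tree instead of inside the tokenizer. This is only an illustration under stated assumptions: `replaceInText`, the `skipInside` parameter, and the `mention` node type are hypothetical and not part of mdast's API.

```javascript
// Hypothetical helper, not an mdast API: replace regex matches found in
// `text` nodes with custom nodes. `expression` must have the global flag.
function replaceInText(tree, expression, createNode, skipInside = ['link']) {
  visitChildren(tree)

  function visitChildren(parent) {
    if (!Array.isArray(parent.children)) return
    const result = []
    for (const child of parent.children) {
      if (child.type === 'text') {
        result.push(...split(child.value))
      } else {
        // Scope rule: do not descend into e.g. links.
        if (!skipInside.includes(child.type)) visitChildren(child)
        result.push(child)
      }
    }
    parent.children = result
  }

  // Split a string into text nodes and nodes created for each match.
  function split(value) {
    const nodes = []
    let last = 0
    for (const match of value.matchAll(expression)) {
      if (match.index > last) {
        nodes.push({type: 'text', value: value.slice(last, match.index)})
      }
      nodes.push(createNode(match))
      last = match.index + match[0].length
    }
    if (last < value.length) {
      nodes.push({type: 'text', value: value.slice(last)})
    }
    return nodes
  }
}

const paragraph = {
  type: 'paragraph',
  children: [
    {type: 'text', value: 'Thanks @alice and '},
    {
      type: 'link',
      url: 'https://example.com',
      children: [{type: 'text', value: '@bob'}]
    }
  ]
}

replaceInText(paragraph, /@(\w+)/g, (match) => ({
  type: 'mention', // hypothetical node type
  name: match[1]
}))

console.log(paragraph.children.map((node) => node.type).join(', '))
// => 'text, mention, text, link'
```

The `@bob` inside the link is left alone because `link` is listed in `skipInside`.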

Stringify: Preferred link-style

Inline- or reference-style.

Should decode HTML entities

Such as &amp;amp; in AT&amp;amp;T.

Should support plugins

Probably like ware, retext, or duo.

Avoid using peerDependencies

I'm trying to force mdast-react to use 0.26.2 or newer because of the recently-fixed parsing bugs. Doing so results in a peer dependency error:

~/src/mdast-react〉npm install
npm ERR! Darwin 14.3.0
npm ERR! argv "node" "/usr/local/bin/npm" "install"
npm ERR! node v0.12.6
npm ERR! npm  v2.12.1
npm ERR! code EPEERINVALID

npm ERR! peerinvalid The package mdast does not satisfy its siblings' peerDependencies requirements!
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.22.0
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.22.0
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.25.0
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.24.0
npm ERR! peerinvalid Peer [email protected] wants mdast@>=0.22.0

npm ERR! Please include the following file with any support request:
npm ERR!     /Users/tmcw/src/mdast-react/npm-debug.log

Combined with npm moving away from peerDependencies, it would be awesome to use normal ol' dependencies rather than peerDependencies to string mdast packages together.

Bullet parser does not follow common mark

CommonMark Spec

To begin with, under the gfm rule, GitHub does not follow the CommonMark spec. Either way, the default case works fine.

But with commonmark: true, I think it should follow the spec. What do you think?

Transformer should not rely on mutated object

Take the following abbreviated sample of an embedded plugin:

// This will only return the first element in the .md
const processor = mdast().use(function (mdst, opt) {
  function transformer(ast, file) {
    ast.children = ast.children.slice(0, 1);
  }
  return transformer;
});
return processor.process(data);

In this example, the transformer is expected to mutate its incoming parameters, ast and file. This confused me for quite a while, as keeping parameters immutable is commonly considered a best practice. I expected the transformer to return the transformed objects, and not seeing that in any of your plugins threw me a bit. The transformer's return value isn't actually used for anything.

A better approach would be something like this:

// This will only return the first element in the .md
const processor = mdast().use(function (mdst, opt) {
  function transformer(ast, file) {
    var mutatedAst = ast.children.slice(0, 1);
    return mutatedAst;
  }
  return transformer;
});
return processor.process(data);

While I understand two parameters are in play, they should probably be returned grouped together as an object. The point is that one should not expect the user to mutate incoming parameters without even returning a result; returning results instead of mutating inputs is a basic principle of functional programming.

I know correcting this would probably break other plugins: perhaps you could schedule it in to the next major release?
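
For comparison, a transform can avoid mutating its input by returning a new root that wraps the sliced children. The sketch below uses plain objects and calls the transform by hand; `keepFirstChild` is a hypothetical name. Current versions of unified use a node returned from a transformer as the new tree, but nothing in this sketch depends on that. Note that the returned value is a whole root node, not a bare array of children.

```javascript
// Hypothetical attacher whose transform returns a new tree
// instead of mutating the one it was given.
function keepFirstChild() {
  return function (tree) {
    // Shallow-copy the root and replace only its `children`.
    return {...tree, children: tree.children.slice(0, 1)}
  }
}

const input = {
  type: 'root',
  children: [
    {type: 'heading', depth: 1, children: [{type: 'text', value: 'Title'}]},
    {type: 'paragraph', children: [{type: 'text', value: 'Body'}]}
  ]
}

const output = keepFirstChild()(input)

console.log(output.children.length) // => 1
console.log(input.children.length) // => 2 (the input is untouched)
```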

Ways to use global mdast plugins with CLI?

What is the preferred way of using globally installed plugins with CLI?

$ echo "# hello" | mdast -u mdast-html
# hello

<stdin>
        1:1  error    Error: Cannot find module 'mdast-html'

It just worked before. I found two ways of working around it.

Including $(npm root -g) in $NODE_PATH:

$ echo "# hello" | env NODE_PATH="$(npm root -g):$NODE_PATH" mdast -u mdast-html

Specifying full path to a plugin:

$ echo "# hello" | mdast -u "$(npm root -g)/mdast-html"

Both ways are somewhat clumsy. Is there a simpler way of doing that or some relevant configuration option?

Can't parse html tag correctly

mdast can parse <div>div</div> and <pre>pre</pre>, but not <a>foo</a> and <span>foo</span>.

It looks like inline tags can't be parsed.

coffee> mdast.parse('<a>foo</a>').children[0]
{ type: 'paragraph',
  children: 
   [ { type: 'html',
       value: '<a>',
       position: [Object] },
     { type: 'text',
       value: 'foo',
       position: [Object] },
     { type: 'html',
       value: '</a>',
       position: [Object] } ],
  position: 
   { start: { line: 1, column: 1 },
     end: { line: 1, column: 11 } } }

Add website

mdast should have a cool website!

Maybe http://mdast.md? http://mdast.js.org (free)?, or just at GitHub (free)?

Also: should be good looking and useful.

Should have a CLI

This would be extra powerful with plugins (like duo)

Inverse order of attachers when passed in array

Looking at index.js, it seems that

mdast.use([plugin1, plugin2, plugin3])

is equivalent to

mdast.use(plugin3).use(plugin2).use(plugin1)

which is counter-intuitive if you ask me.

Why is it so?

Positions of fenced vs. unfenced code

Hi. I'm in the middle of switching from marked to mdast for parsing in my mockdown library. I've run into a slight snag, however, which is that mdast gives the start position of a fenced code block as the line where the backquotes are, but gives the start position of an indented code block as the line where the actual code starts.

When I was using marked, this wasn't a problem because I could detect the absence of a lang property to know that a code block was indented rather than fenced, and the presence of the attribute (even if null) to know when I need to offset the code's line position by 1. But mdast creates the property with a null value on indented blocks as well as on fenced ones, so there is no way for me to know whether to offset the line number.

Well, technically, there is: I can count the number of lines in the code node's value, and compare this to the number of lines in the node's position range, and if it's 2 less, I know it's a fenced code block and can offset the start position of the code accordingly.

This seems a bit fragile, though, so I was wondering if there can be some official way to do this. That is, to either be able to tell the two kinds of code blocks apart (e.g. via a fenced property), or to have the position of a code block be registered as the position where the code starts, rather than the position where the code's block wrapper starts.

Heck, just allowing an empty string for lang when it's a fenced block without a language would work for me. The main point is just to have an officially supported way to be able to know what line number the actual code of a code node begins on, whether the block is indented or fenced.

Thanks!
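
The line-counting workaround described above could look roughly like this. It is a heuristic sketch, not an official API: `isProbablyFenced` and `codeStartLine` are hypothetical names, and the heuristic assumes the fence is closed (an unclosed fence spans only one extra line).

```javascript
// Heuristic: a closed fenced block spans two more lines (the opening
// and closing fence) than its `value` contains.
function isProbablyFenced(node) {
  const valueLines = node.value === '' ? 0 : node.value.split('\n').length
  const spannedLines = node.position.end.line - node.position.start.line + 1
  return spannedLines - valueLines === 2
}

// Line on which the actual code starts.
function codeStartLine(node) {
  return node.position.start.line + (isProbablyFenced(node) ? 1 : 0)
}

// Sample nodes, written by hand for illustration.
const fenced = {
  type: 'code',
  lang: null,
  value: 'a()\nb()',
  position: {start: {line: 3, column: 1}, end: {line: 6, column: 4}}
}
const indented = {
  type: 'code',
  lang: null,
  value: 'a()\nb()',
  position: {start: {line: 3, column: 5}, end: {line: 4, column: 8}}
}

console.log(codeStartLine(fenced)) // => 4
console.log(codeStartLine(indented)) // => 3
```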

Add test for multiple footnotes to the same definition

Seems to work currently, when typing the following in the demo.

Here’s a footnote[^1] and such[^1].

[^1]: This one’s also a footnote.

…but it seems to fail when inlining footnotes.

Do not output blank lines between definitions

Source and reprocessed versions at https://gist.github.com/anonymous/3bf6b6095f73702c187d

Output wrong position when passing empty string

var mdast = require('mdast');
var emptyString = "";
var ast = mdast.parse(emptyString);
console.log(JSON.stringify(ast));
/*
{
    "type": "root",
    "children": [],
    "position": {
        "start": {
            "line": 1,
            "column": 1
        }
    }
}
*/
// position.end is undefined

Example : http://requirebin.com/?gist=ad3c34ef897338867009

Expected:

{
    "type": "root",
    "children": [],
    "position": {
        "start": {
            "line": 1,
            "column": 1
        },
        "end": {
            "line": 1,
            "column": 1
        }
    }
}

Actual:

position.end is undefined

Add list-item-indent stringification option

...which defaults to "tab-size", for greatest support, but also accepts "mixed" and "1".

Supersedes GH-30.
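
As a rough illustration of the three values, the sketch below computes the text placed before a list item's content. The semantics are assumptions for illustration only: "tab-size" pads the bullet to the next multiple of 4 columns, "1" uses a single space, and "mixed" is taken to mean "tab-size" for loose items and "1" for tight ones. `listItemPrefix` is a hypothetical helper, not mdast's implementation.

```javascript
// Hypothetical helper with assumed semantics (see lead-in).
function listItemPrefix(bullet, listItemIndent, loose) {
  const tabSize = 4
  const useTab =
    listItemIndent === 'tab-size' || (listItemIndent === 'mixed' && loose)
  // Either pad to the next tab stop, or use bullet plus one space.
  const width = useTab
    ? Math.ceil((bullet.length + 1) / tabSize) * tabSize
    : bullet.length + 1
  return bullet.padEnd(width, ' ')
}

console.log(JSON.stringify(listItemPrefix('*', 'tab-size', false))) // => "*   "
console.log(JSON.stringify(listItemPrefix('*', '1', false))) // => "* "
console.log(JSON.stringify(listItemPrefix('*', 'mixed', true))) // => "*   "
console.log(JSON.stringify(listItemPrefix('*', 'mixed', false))) // => "* "
```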

Add `style` properties on nodes

Currently, only global stringification settings, such as bullet, are supported. I’d like to extend stringification style to per-node settings. Thus, a list-item can have a style.bullet = '*' property.

Something like:

heading nodes have an enum headingStyle property set to "atx", "atx-closed", or "setext";
table nodes have a boolean looseTable property;
table nodes have a boolean spacedTable property;
code nodes have a nullable enum fenceMarker property set to "`" or "~";
code nodes have a boolean fences property;
listItem nodes have an enum listItemBullet property set to *, -, +, ., or );
listItem nodes have a nullable listItemIndex property set to an integer;
horizontalRule nodes have an enum ruleMarker property set to *, -, or _;
horizontalRule nodes have a boolean ruleRepetition property;
horizontalRule nodes have a boolean ruleSpaces property;
strong and emphasis nodes have an enum emphasisMarker property set to _ or *.

These should be overridden when a setting is given to mdast (this allows mdast to fix code style), but should themselves override the default values noted in mdast.process().

Supersedes GH-30.
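
The precedence described above (explicit setting, then per-node style, then default) can be sketched as follows. `resolveStyle`, the `style` object on the node, and the default value used here are all hypothetical and only illustrate the proposal.

```javascript
// Hypothetical resolution order: explicit setting > node style > default.
function resolveStyle(name, node, settings, defaults) {
  if (settings[name] !== undefined) return settings[name]
  if (node.style && node.style[name] !== undefined) return node.style[name]
  return defaults[name]
}

const defaults = {bullet: '-'} // assumed default, for illustration
const styledItem = {type: 'listItem', style: {bullet: '*'}, children: []}
const plainItem = {type: 'listItem', children: []}

// No setting given: the style stored on the node wins over the default.
console.log(resolveStyle('bullet', styledItem, {}, defaults)) // => '*'
// A setting is given: it overrides the node style (fixing code style).
console.log(resolveStyle('bullet', styledItem, {bullet: '+'}, defaults)) // => '+'
// Neither is present: fall back to the default.
console.log(resolveStyle('bullet', plainItem, {}, defaults)) // => '-'
```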

Link parser lowercases identifiers

When I parse [][@TayEA11], the resulting AST is

{
  "type": "root",
  "children": [
    {
      "type": "paragraph",
      "children": [
        {
          "type": "linkReference",
          "identifier": "@tayea11",
          "referenceType": "full",
          "children": [],
          "position": {
            "start": {
              "line": 1,
              "column": 1
            },
            "end": {
              "line": 1,
              "column": 13
            },
            "indent": []
          }
        }
      ],
      "position": {
        "start": {
          "line": 1,
          "column": 1
        },
        "end": {
          "line": 1,
          "column": 13
        },
        "indent": []
      }
    }
  ],
  "position": {
    "start": {
      "line": 1,
      "column": 1
    },
    "end": {
      "line": 1,
      "column": 13
    }
  }
}

Is there a setting that keeps the casing of identifiers?

Blank cell in table can not be parsed

| | a|c|
|--|:----:|:---|
|a|b|c|
|a|b|c|

1:3: Incorrectly eaten value: please report this warning on http://git.io/vUYWz

Lifecycle events for plugins

Hey! Great work on mdast, it's really rad. I'm using it to set up a build system for the Node.js documentation WG. As part of that effort, I started building count-docula, which currently consumes mdast and presents its own CLI. If possible, I'd love to make count-docula just another plugin that mdast consumes.

What count-docula is currently doing:

Given a directory, it collects every markdown file within that directory.
- This duplicates work from mdast's CLI.
For each markdown file, the plugin looks for three directives (import, export, and anchor.)
- Anchors are user-defined ids that are assigned to the closest parent block element — they're there so that heading text can be changed independent of links, and so that links can be tracked and verified across documents.
- Once all anchors are found, then all exports are determined. These are links that will be made available when "importing" the current document.
- Finally, the import directives are hit.
  - Importantly, import directives are able to bring in documents from outside the original working set.
The plugin artificially blocks process from completing (using a function passed as an option) until all documents have been visited, and their anchors, exports, and imports declared.
- Warnings are added at this stage for unknown|duplicate reference link definitions, bad imports, and bad exports.
Once all documents have been visited & resolved, the plugin continues to the "render" or "test" task.
- The test task augments lint with a test checking to see that no documents in the original working set are "orphaned" — only one document in the original working set may have no incoming links.
  - Otherwise, this step replicates much of mdast's CLI machinery.
- The build task accepts a template for rendering the document into, but otherwise works the same as mdast's CLI machinery.

In order to turn count-docula into a plugin:

mdast's plugin API would need a lifecycle event for "the CLI has collected all of the docs in this dir." That event may be asynchronous, so mdast should delegate to the plugin before continuing (via a callback or other method.)
The directory set API may have to be capable of adding new source md document paths and making the resulting ASTs available to the plugin.

Something like:

module.exports = function attacher(md, opts) {
  md.onDocsCollected((workingSet, next) => {
    // workingSet is an "array-ish" set of all of the `File` objects that
    // mdast's cli found.
    workingSet.parseEach(({filename, ast}, next) => {
      // search for documents to import from the AST
      workingSet.add('some/new/path')
      next()
    }, function(err) {
      workingSet.forEach(({filename, ast}) => {
        // resolve all of the links, then let `mdast` know that
        // the workingSet's files are ready to be rendered / tested / etc.
        // if the workingSet's files were parsed, use those asts
        // instead of parsing again. Otherwise parse them.
        next()
      })
    })
  })
}

Of course, there's zero pressure to do this — or if you'd like I would be happy to take a stab at implementing it. A workingSet API seems like a natural place to add meta information for other plugins, as well — for example, providing a template/framing API for mdast-html.

Thanks again, and great work on mdast!

Could we add more code samples in manpages?

https://github.com/wooorm/mdast/blob/master/doc/mdastplugin.3.md

In this page, I am trying very hard to understand how to create and manipulate plugins. While I know the concepts have been organized very ably, you are introducing a lot of new terms such as attacher, transformer, and completer.

There is but one code example on creating a plugin, and it only implements the transformer. Code samples are to coders as pictures are to anyone else in a manual. They put things in perspective. I am having a very hard time figuring out how to implement the definitions above, and if more snippets were given, I am sure this could be simplified greatly.

Example:

To access all files once they are transformed, create a completer. A completer is invoked before files are compiled, written, and logged, but after reading, parsing, and transforming. Thus, a completer can still change files or add messages.

Where does one create a signature? A simple example would be worth another 5 paragraphs of description.

Print warnings & errors to stderr?

One of the examples in mdast --help is not working correctly:

$ cat readme.md
- 1
- 2
$ cat readme.md | mdast -s 'setext: true, bullet: "*"' > readme-new.md
$ cat readme-new.md
*   1
*   2

<stdin>: no issues found

Add an option to output something when no messages are found

Spin off from #57

Create mdast-html

One of the major things to do is create a plug-in which compiles an mdast AST into HTML.

This plug-in would be a great way to test how applicable the AST is for heavy duty transpiling into another language.

Want a "don't merge HTML nodes" option

Sometimes merged HTML nodes get in my way when transforming the AST into a virtual DOM.

We can't just split a seemingly-merged HTML node by /\n\n/ because doing so breaks <div>text\n\n</div>[1] in <div>text and </div>. Though I'm fine with nodes whose value is simple tag (<div>, </div>) or balanced fragment (<div>text</div>), something like <div>text is not very acceptable.

[1] it can be obtained by parsing this Markdown document:

<div>text

</div>

Add `--file-path` cli flag for stdin

Just for some nice logging, I can imagine it to be useful by third party cli-engine users (projects which require mdast/cli);

Add support for CLI plugins

I can imagine other tools would want to:

Add extensions;
Add settings.

Fix demo

The current demo is horrible. It’s slow, not that useful, and more.

It should be good looking;
It should use a faster editor;
It should be user-friendly.

Add missing/invalid footnotes/links to ast

This would enable mdlint-like tools to raise an issue when a definition is forgotten.
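
As a rough sketch of what such a tool could do once references without definitions are present in the tree: collect the defined identifiers, then report every reference that has none. The node types (`linkReference`, `definition`) and the `identifier` field are assumptions made for this example, not the shape of the current AST:

```javascript
// Hypothetical sketch: node types and the `identifier` field are assumptions.
function findMissingDefinitions(tree) {
  const defined = new Set()
  const used = []

  function visit(node) {
    if (node.type === 'definition') defined.add(node.identifier)
    if (node.type === 'linkReference') used.push(node.identifier)
    const children = node.children || []
    children.forEach(function (child) {
      visit(child)
    })
  }

  visit(tree)

  // Keep only the references that were never defined.
  return used.filter(function (identifier) {
    return !defined.has(identifier)
  })
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        {type: 'linkReference', identifier: 'a', children: []},
        {type: 'linkReference', identifier: 'b', children: []}
      ]
    },
    {type: 'definition', identifier: 'a'}
  ]
}

console.log(findMissingDefinitions(tree)) // => ['b']
```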

uglify breaks mdast

I've just created a testcase-repo to replicate this one because it's very weird:

https://github.com/tmcw/mdast-uglify-bug

The gist is that UglifyJS causes mdast to fail on processing input that it would otherwise be able to process.

Watching files

Hi, thanks for your work. I'm trying to use mdast-lint, and am thinking it'd be wonderful to have something like a --watch option built into mdast.

Stringify: Preferred fence style

Tildes (~) or ticks (```).

Cannot distinguish `|---|` and `|:---|`

mdast parses an un-aligned table column (|---|) as left-aligned, the same as |:---|. This makes it impossible to emulate GitHub's Markdown renderer, which renders the header of an un-aligned column centered and the body left-aligned by leaving their text-align style unspecified:

|un-aligned(center)|center|left|
|---|:---:|:---|
|Lorem ipsum dolor sit amet|Lorem ipsum dolor sit amet|Lorem ipsum dolor sit amet|
|un-aligned(left)|center|left|

↓

un-aligned(center)	center	left
Lorem ipsum dolor sit amet	Lorem ipsum dolor sit amet	Lorem ipsum dolor sit amet
un-aligned(left)	center	left
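
One way to keep the two cases apart is to return a distinct value when the delimiter cell has no colon at all. A minimal sketch, where `alignmentOf` and `alignmentsOf` are illustrative helpers and not part of mdast:

```javascript
// Hypothetical helper: maps one delimiter cell to an alignment.
// Returns null when no colon is present, so "unspecified" stays
// distinguishable from an explicit "left".
function alignmentOf(cell) {
  const value = cell.trim()
  const left = value.charAt(0) === ':'
  const right = value.charAt(value.length - 1) === ':'

  if (left && right) return 'center'
  if (left) return 'left'
  if (right) return 'right'
  return null
}

function alignmentsOf(row) {
  // Drop the optional leading and trailing pipes, then split into cells.
  return row.trim().replace(/^\||\|$/g, '').split('|').map(alignmentOf)
}

console.log(alignmentsOf('|---|:---:|:---|')) // => [null, 'center', 'left']
```

A renderer could then omit the text-align style for `null` and set it only for explicit alignments.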

Store all links in central place, not just referenced links

This would make sure just one reference is created when stringifying with referenceLinks: true:

[a link][link] and [another link](http://example.com)

[link]: http://example.com

Yields:

[a link][1] and [another link][2]

[1]: http://example.com
[2]: http://example.com
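
A sketch of the idea, assuming a central registry keyed by URL. `createReferences` is a hypothetical helper written for this example, not mdast code:

```javascript
// Hypothetical sketch: assigns one numeric reference per distinct URL.
function createReferences() {
  const ids = new Map()

  return {
    // Returns the existing id for a URL, or registers a new one.
    idFor: function (url) {
      if (!ids.has(url)) ids.set(url, String(ids.size + 1))
      return ids.get(url)
    },
    // Serializes every definition, in insertion order.
    definitions: function () {
      return Array.from(ids, function (entry) {
        return '[' + entry[1] + ']: ' + entry[0]
      }).join('\n')
    }
  }
}

const references = createReferences()
const first = references.idFor('http://example.com')
const second = references.idFor('http://example.com')
const output =
  '[a link][' + first + '] and [another link][' + second + ']\n\n' +
  references.definitions()

console.log(output)
// [a link][1] and [another link][1]
//
// [1]: http://example.com
```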

Extending grammar

How would one extend the parser's grammar? I understand that I can create a plugin and create a parser that inherits from mdast's parser, but how to write the tokenizer and whatever else is needed is unclear.

Do you mind helping me out with one example?

Let's say I have some custom markdown that looks like this:

+++small

SOME TEXT CONTENT

+++

How would one add this grammar to the parser such that content enclosed in +++ is marked as children? For example:

{
  type: MY_CUSTOM_TYPE, // captured by enclosing +++
  size: 'small',
  children: [{
    type: 'text'
    ....
  }]
}
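
To make the question concrete, here is a standalone sketch of the matching step only. It is not wired into mdast's parser, the node type name `customBlock` is made up for this example, and the content is kept as a single text node instead of being parsed recursively:

```javascript
// Hypothetical sketch: a standalone matcher, not wired into mdast's parser.
// `customBlock` is an illustrative node type name.
const expression = /^\+\+\+([a-z]+)\n\n([\s\S]*?)\n\n\+\+\+(?:\n|$)/

function tokenizeCustomBlock(value) {
  const match = expression.exec(value)

  if (!match) return null

  return {
    type: 'customBlock',
    size: match[1],
    // A real tokenizer would parse the content recursively; this keeps it as text.
    children: [{type: 'text', value: match[2]}]
  }
}

const node = tokenizeCustomBlock('+++small\n\nSOME TEXT CONTENT\n\n+++\n')

console.log(node.size) // => 'small'
console.log(node.children[0].value) // => 'SOME TEXT CONTENT'
```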

I'm open to ideas if you have a better idea for how the ast should look. You're certainly more expert than I am. :)

Thanks for your time.

Fix CLI-settings

Currently, it’s impossible to pass nested objects or arrays because the parsing system is way too simple. This should be changed to accept just JSON.

Something like mdast . -s 'foo: {bar: "baz"}'?
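
A sketch of why a simple comma-based parser cannot handle nesting, and how strict JSON avoids the problem. Both functions are hypothetical and written only for this comparison; neither is mdast's CLI code, and the strict JSON form shown needs quoted keys, unlike the looser syntax suggested above:

```javascript
// Hypothetical sketch of the two approaches; neither is mdast's actual CLI code.

// Naive approach: split on commas, then on the first colon.
function parseNaive(value) {
  const settings = {}

  value.split(',').forEach(function (pair) {
    const index = pair.indexOf(':')
    settings[pair.slice(0, index).trim()] = pair.slice(index + 1).trim()
  })

  return settings
}

// JSON approach: the whole setting string is one JSON object.
function parseJson(value) {
  return JSON.parse(value)
}

// Flat settings survive the naive parser, although every value stays a string.
const flat = parseNaive('setext: true, bullet: "*"')

// The comma inside the nested object cuts the value apart.
const broken = parseNaive('foo: {bar: "baz", qux: [1, 2]}')

// Strict JSON keeps the nested structure intact.
const nested = parseJson('{"foo": {"bar": "baz", "qux": [1, 2]}}')

console.log(flat, broken, nested)
```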

Should expose footnote definitions as a node.

An object instead of an array:

   "footnotes": {
-    "1": [
-      {
-        "type": "paragraph",
-        "children": [
-          {
-            "type": "text",
-            "value": "A footnote."
-          }
-        ]
-      }
-    ]
+    "1": {
+      "type": "footnoteDefinition",
+      "id": "1",
+      "children": [
+        {
+          "type": "paragraph",
+          "children": [
+            {
+              "type": "text",
+              "value": "A footnote"
+            }
+          ]
+        }
+      ]
+    }
   }
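
The change in shape shown in the diff can be expressed as a small conversion. `toDefinitionNodes` is a hypothetical helper for illustration, not an mdast function:

```javascript
// Hypothetical sketch: converts the old shape (id -> array of children)
// into the proposed shape (id -> footnoteDefinition node).
function toDefinitionNodes(footnotes) {
  const result = {}

  Object.keys(footnotes).forEach(function (id) {
    result[id] = {
      type: 'footnoteDefinition',
      id: id,
      children: footnotes[id]
    }
  })

  return result
}

const before = {
  1: [{type: 'paragraph', children: [{type: 'text', value: 'A footnote.'}]}]
}

const after = toDefinitionNodes(before)

console.log(JSON.stringify(after, null, 2))
```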

Why 3 spaces after list bullet?

* list item

Stringifying this adds extra spaces after the bullet:

-    list item

I just want 1 space after the list bullet.

But I can't find any option for this in https://github.com/wooorm/mdast/blob/master/doc/options.md#list-item-bullets

And this is all I found in the source code:

$ grep -r "'   '" node_modules/mdast
node_modules/mdast/node_modules/concat-stream/node_modules/readable-stream/node_modules/core-util-is/float.patch:-            return '   ' + line;

Add support for tab characters

To enable CommonMark’s tab expansion by dependants.
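
A minimal sketch of tab expansion, assuming a tab stop of 4 columns as in CommonMark. `expandTabs` is an illustrative name, not an mdast function, and columns are counted per UTF-16 code unit for simplicity:

```javascript
// Minimal sketch, assuming a tab stop of 4 columns as in CommonMark.
// `expandTabs` is an illustrative name, not an mdast function.
function expandTabs(line, size) {
  const tabSize = size || 4
  let result = ''

  for (let index = 0; index < line.length; index++) {
    const character = line.charAt(index)

    if (character === '\t') {
      // Pad with spaces up to the next multiple of `tabSize`.
      result += ' '.repeat(tabSize - (result.length % tabSize))
    } else {
      result += character
    }
  }

  return result
}

console.log(JSON.stringify(expandTabs('\tfoo'))) // => "    foo"
console.log(JSON.stringify(expandTabs('-\tfoo'))) // => "-   foo"
```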

1.0.0?

mdast is currently on a semver "unstable" 0.x.x version. Is that intentional?

It seems to have tests with full coverage, no open issues, no recent API breaks (though I haven't checked very carefully), and a handful of dependent packages.

What's blocking a stable release?

Paragraph `mdast.stringify` creates line-breaks on return

When invoking mdast.stringify on a paragraph node and all of its child nodes, it renders the original paragraph with line breaks. Example:

This is a markdown pargraph with a [link](http://this-page-intentionally-left-blank.org) to something silly.

On stringifying this, one gets:

This is a markdown pargraph with a 
[link](http://this-page-intentionally-left-blank.org)
 to something silly.

Refactor breaks in CommonMark

They’re currently added as an escape node ({type: 'escape', value: '\n'}), but should be added as {type: 'break'}.

This should be accompanied by a stringify option to either use CommonMark style, or trailing-space style.
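
A sketch of both halves on plain objects. The function names and the `style` values are assumptions for this example, not mdast's actual parser or compiler code:

```javascript
// Hypothetical sketch on plain objects; not mdast's actual parser or compiler code.

// Replace `{type: 'escape', value: '\n'}` nodes with `{type: 'break'}`.
function escapesToBreaks(node) {
  if (node.type === 'escape' && node.value === '\n') return {type: 'break'}
  if (!node.children) return node

  return Object.assign({}, node, {
    children: node.children.map(function (child) {
      return escapesToBreaks(child)
    })
  })
}

// Serialize a break in either style; `style` is an illustrative option name.
// CommonMark style is a backslash before the newline,
// trailing-space style is two spaces before the newline.
function stringifyBreak(style) {
  return style === 'commonmark' ? '\\\n' : '  \n'
}

const paragraph = {
  type: 'paragraph',
  children: [
    {type: 'text', value: 'foo'},
    {type: 'escape', value: '\n'},
    {type: 'text', value: 'bar'}
  ]
}

const converted = escapesToBreaks(paragraph)

console.log(converted.children[1]) // => {type: 'break'}
```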

Should accept empty fenced code blocks

With default options, the following…

Before

```one
```

And

```two
```

Yields:

Before

````one
```

And

```two
````
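
For comparison, here is a simplified fence matcher that accepts an empty body. It is only a sketch under assumptions: it is not mdast's tokenizer, it handles backtick fences only, and it makes no claim about the actual cause of the output above. The content group is optional and lazy, so a closing fence on the very next line is tried first:

```javascript
// Hypothetical sketch: a simplified fence matcher, not mdast's tokenizer.
// Group 1 is the opening fence, group 2 the language, group 3 the content.
const fence = /^(`{3,})([^\n`]*)\n(?:([\s\S]*?)\n)??\1[ \t]*$/gm

function fencedBlocks(value) {
  return Array.from(value.matchAll(fence), function (match) {
    return {lang: match[2], value: match[3] || ''}
  })
}

// Build the fences with `repeat` to keep this example easy to embed.
const ticks = '`'.repeat(3)
const input = ['Before', '', ticks + 'one', ticks, '', 'And', '', ticks + 'two', ticks].join('\n')

console.log(fencedBlocks(input))
// => [{lang: 'one', value: ''}, {lang: 'two', value: ''}]
```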
