micromark
small, safe, and great commonmark (optionally gfm) compliant markdown parser
Home Page: https://unifiedjs.com
License: MIT License
A hello-world example is needed for how to write an extension.
The real extensions are too complex for quick reference.
A really simple extension.
Say I want to write an extension to make links external. I'm still digging around; maybe modifying attributes would be a good hello-world?
Honestly, I haven't debugged this at all, but I'm hopeful that @wooorm will be kind enough to take a look and probably figure out what's going on very quickly :)
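As a starting point, here is a minimal sketch of what an HTML-side extension could look like, assuming micromark@3's htmlExtensions option. The token type and the this.tag call reflect my understanding of the compile context, and the attribute is purely illustrative; making links external specifically is harder, because anchor tags are only written when the whole link construct exits.

```javascript
// Hedged sketch of an html-extension "hello world" (assumes micromark@3):
// an extension maps `enter`/`exit` to handlers keyed by token type, and
// handlers write output with `this.tag()` / `this.raw()`.
const helloHtmlExtension = {
  enter: {
    paragraph() {
      // override the default opening tag for paragraphs
      this.tag('<p data-hello="world">')
    }
  }
}

// usage sketch (assumes micromark is installed):
// import {micromark} from 'micromark'
// micromark('hi', {htmlExtensions: [helloHtmlExtension]})
```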
The following markdown should produce a table (it does on GitHub, for example), but micromark parses it as text.
Gist: https://gist.github.com/diervo/931d9f68a08922efb3341c6faff3caea
micromark 3.2
https://codesandbox.io/s/trusting-worker-lt4scv
Nested ordered lists are not parsed correctly if they don't start with 1.
They work on GitHub:
Nested lists are parsed correctly at any level.
Also see the failing test: main...adobe-rnd:micromark:nested-lists-test
If a nested ordered list doesn't start with 1. it is not parsed as a list.
Node v16
No response
No response
No response
4.0.0
No response
user@HOST micromark-setext % npm ls micromark
micromark-setext@ /Users/user/Documents/micromark-setext
└── [email protected]
user@HOST micromark-setext % cat issue.mjs
import { parse } from "micromark";
import { postprocess } from "micromark";
import { preprocess } from "micromark";
const markdown = `
Text
Setext
======
Text
`;
const encoding = undefined;
const end = true;
const options = undefined;
const chunks = preprocess()(markdown, encoding, end);
const parseContext = parse(options).document().write(chunks);
const events = postprocess(parseContext);
for (const event of events) {
const [ kind, token, context ] = event;
if (kind === "enter") {
const { type, start, end } = token;
const { "line": startLine } = start;
const { "line": endLine } = end;
console.dir(`${type} (${startLine}-${endLine}): ${context.sliceSerialize(token)}`);
}
}
user@HOST micromark-setext % node issue.mjs
'lineEndingBlank (1-2): \n'
'content (2-2): Text'
'paragraph (2-2): Text'
'data (2-2): Text'
'lineEnding (2-3): \n'
'lineEndingBlank (3-4): \n'
'setextHeading (4-5): Text\n\nSetext\n======'
'setextHeadingText (4-4): Setext'
'data (4-4): Setext'
'lineEnding (4-5): \n'
'setextHeadingLine (5-5): ======'
'setextHeadingLineSequence (5-5): ======'
'lineEnding (5-6): \n'
'lineEndingBlank (6-7): \n'
'content (7-7): Text'
'paragraph (7-7): Text'
'data (7-7): Text'
'lineEnding (7-8): \n'
user@HOST micromark-setext %
Note specifically this part of the output: 'setextHeading (4-5): Text\n\nSetext\n======'
While the start and end lines are correct, the output of sliceSerialize includes "Text\n\n" from lines 2 and 3, which is not part of the heading (confirmed by the associated setextHeadingText token, which contains only "Setext").
See above.
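For comparison, a naive line-based slice over the same source (a hypothetical helper, not micromark's sliceSerialize) returns only the lines 4-5 span:

```javascript
// Hypothetical helper: extract a 1-based inclusive line range from a document.
function sliceByLines(doc, startLine, endLine) {
  return doc.split('\n').slice(startLine - 1, endLine).join('\n')
}

// same input as the repro above
const doc = '\nText\n\nSetext\n======\n\nText\n'
console.log(sliceByLines(doc, 4, 5)) // 'Setext\n======'
```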
Node v16
npm v7
macOS
No response
No response
Use micromark
in a Webpack app.
The app builds without configuration changes.
The build fails with:
WARNING in ../../node_modules/power-assert-formatter/lib/create.js 30:28-49
Critical dependency: the request of a dependency is an expression
@ ../../node_modules/power-assert-formatter/index.js 12:0-40
@ ../../node_modules/power-assert/index.js 15:16-49
@ ../../node_modules/micromark/dev/lib/create-tokenizer.js 27:0-33 213:4-22 217:4-10 223:4-22 231:4-22 236:4-10 285:4-10 286:4-25 298:4-10 299:4-25 302:4-10 305:4-22 311:4-10 414:10-16 464:8-26 472:8-26 508:4-10 576:4-10 577:4-10 652:
10-16
@ ../../node_modules/micromark/dev/lib/parse.js 14:0-53 49:13-28
and similar warnings for any package that now has a dependency on power-assert.
There is a solution provided by power-assert here, but it seems like it would hide other warnings that I would want to see, and I don't think I should need to modify my Webpack config to get micromark to work.
It would probably be better not to add power-assert as a dependency in a patch release, since it's likely to break many people's builds.
Also note that the main micromark package uses power-assert, but I think it's missing from the package's dependencies.
Out of curiosity, what is the advantage of switching to power-assert?
No response
No response
No response
No response
latest
No response
browserify/node-util#62 (comment)
Use console.assert instead.
assert is a Node built-in module.
Node v14
yarn v1
macOS
Vite
Fuzz testing micromark, by itself without plugins (#18 modified)
const fs = require('fs')
const micromark = require('../index')
function fuzz(buf) {
  // focus on issues in files less than 1 MB
  if (buf.length > 1000000) return
  // write the input to a temp file in case an unrecoverable exception is thrown
  fs.writeFileSync('temp.txt', buf)
  // commonmark buffer without html
  micromark(buf)
}

module.exports = {fuzz}
after running through 10-30 files often crashes with:
<--- Last few GCs --->
[16841:0x4e8fc10] 11334 ms: Mark-sweep (reduce) 3664.6 (4118.7) -> 3664.6 (4118.7) MB, 162.9 / 0.0 ms (average mu = 0.067, current mu = 0.000) last resort GC in old space requested
[16841:0x4e8fc10] 11494 ms: Mark-sweep (reduce) 3664.6 (4115.7) -> 3664.6 (4116.7) MB, 160.5 / 0.0 ms (average mu = 0.033, current mu = 0.000) last resort GC in old space requested
<--- JS stacktrace --->
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: 0xa02dd0 node::Abort() [node]
2: 0x94e471 node::FatalError(char const*, char const*) [node]
3: 0xb7686e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
4: 0xb76be7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
5: 0xd31485 [node]
6: 0xd43cf1 v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
7: 0xd09562 v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
8: 0xd033e4 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawWithImmortalMap(int, v8::internal::AllocationType, v8::internal::Map, v8::internal::AllocationAlignment) [node]
9: 0xd0b719 v8::internal::Factory::NewInternalizedStringImpl(v8::internal::Handle<v8::internal::String>, int, unsigned int) [node]
10: 0xf3169f v8::internal::StringTable::AddKeyNoResize(v8::internal::Isolate*, v8::internal::StringTableKey*) [node]
11: 0xf3fa16 v8::internal::Handle<v8::internal::String> v8::internal::StringTable::LookupKey<v8::internal::InternalizedStringKey>(v8::internal::Isolate*, v8::internal::InternalizedStringKey*) [node]
12: 0xf3fac6 v8::internal::StringTable::LookupString(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>) [node]
13: 0xb7644b v8::internal::LookupIterator::LookupIterator(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Name>, unsigned long, v8::internal::Handle<v8::internal::JSReceiver>, v8::internal::LookupIterator::Configuration) [node]
14: 0xee1809 v8::internal::LookupIterator::LookupIterator(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::LookupIterator::Key const&, v8::internal::LookupIterator::Configuration) [node]
15: 0x106d9f9 v8::internal::Runtime::SetObjectProperty(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::StoreOrigin, v8::Maybe<v8::internal::ShouldThrow>) [node]
16: 0x106eb07 v8::internal::Runtime_SetKeyedProperty(int, unsigned long*, v8::internal::Isolate*) [node]
17: 0x13fe259 [node]
timeout: the monitored command dumped core
Aborted
on an innocuous looking file, like
# Foo
| Name | GitHub | Twitter |
| ---- | ------ | ------- |
Run fuzzer from #18
no crash
(Same out-of-memory crash and stack trace as shown above.)
Some malformed URLs can crash micromark
var micromark = require('micromark')
console.log(micromark('[](<%>)'))
originally detected with #18, credit to @wooorm for a more minimal repro
<p><a href="%25"></a></p>
URIError: URI malformed
at decodeURI (<anonymous>)
at normalizeUri (micromark/dist/util/normalize-uri.js:1:1040)
at url (micromark/dist/compile/html.js:1:54303)
at Object.onexitmedia (micromark/dist/compile/html.js:1:61812)
at done (micromark/dist/compile/html.js:1:50389)
at compile (micromark/dist/compile/html.js:1:48534)
at buffer (micromark/dist/index.js:1:2192)
at Worker.fuzz [as fn] (micromark/fuzzer.js:1:1781)
at process.<anonymous> (micromark/node_modules/jsfuzz/build/src/worker.js:63:30)
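The underlying failure mode is that decodeURI throws a URIError on a lone percent sign, so any URL normalizer that calls it unguarded can crash. A hypothetical wrapper (not micromark's actual fix) illustrates the guard:

```javascript
// decodeURI throws on malformed percent-encodings such as a bare "%";
// this wrapper falls back to the raw value instead of throwing.
function safeDecodeUri(value) {
  try {
    return decodeURI(value)
  } catch {
    return value
  }
}

console.log(safeDecodeUri('%'))      // '%'
console.log(safeDecodeUri('%C3%A9')) // 'é'
```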
micromark
https://github.com/wataru-chocola/report-micromark-20210827
Run my PoC.
$ git clone https://github.com/wataru-chocola/report-micromark-20210827
$ cd report-micromark-20210827
$ npm install
$ npx node index.js
document constructs are invoked twice in micromark/lib/initialize/document.js:
from the checkNewContainers state:
return effects.check(
containerConstruct,
thereIsANewContainer,
thereIsNoNewContainer
)(code)
from documentContinued
return effects.attempt(
containerConstruct,
containerContinue,
flowStart
)(code)
And I expect that the first invocation, effects.check(...), doesn't make any modifications to events:
* @property {Attempt} check
* Attempt, then revert.
effects.check() does modify events if the construct is for document and has a resolver.
My construct in PoC code dumps context.events at the start.
On the 1st run (from effects.check), we see the correct events, which were generated by previous tokenization.
+ initialize tokenizer (runCount: 1)
+ previous events
[ 'enter', 'chunkFlow', 'term\n' ]
[ 'exit', 'chunkFlow', 'term\n' ]
+ run resolverTo
But on the 2nd run (from effects.attempt), events have been modified by the resolver in the previous check execution.
+ initialize tokenizer (runCount: 2)
+ previous events
[ 'enter', 'defListTerm', 'term\n' ]
[ 'enter', 'chunkFlow', 'term\n' ]
No response
No response
No response
No response
3
No response
In chrome's console run:
const mm = await import('https://esm.sh/micromark@3?bundle');
console.log(mm.micromark('List1\n* item1\n* item2\n\n\n\n'));
console.log('------');
console.log(mm.micromark('List1\n* item1\n* item2\n\n\n \n'));
Note the only difference between the two examples is a single space some blank lines away from the list. Those two examples return different HTML; the latter has the list elements wrapped in <p>.
<p>List1</p>
<ul>
<li>item1</li>
<li>item2</li>
</ul>
------
<p>List1</p>
<ul>
<li>
<p>item1</p>
</li>
<li>
<p>item2</p>
</li>
</ul>
I'm not clear enough on the markdown spec to say which case is actually correct. Certainly other markdown parsers I've tried (though that is not a long list) render it like the first example.
Regardless I'd expect it to be the same between the two. In most markdown editors the trailing space is impossible to see and it can take a long time to track down why some list elements render with increased padding.
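Until the inconsistency is resolved, one workaround is normalizing whitespace-only lines before parsing. A hypothetical pre-pass (not part of micromark) could look like:

```javascript
// Strip spaces/tabs from lines that contain nothing else, so "blank"
// lines are truly empty before the parser sees them.
function stripBlankLineSpaces(doc) {
  return doc.replace(/^[ \t]+$/gm, '')
}

const input = 'List1\n* item1\n* item2\n\n\n \n'
console.log(stripBlankLineSpaces(input) === 'List1\n* item1\n* item2\n\n\n\n') // true
```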
See repro steps. The two examples output visually different HTML, whereas I feel they should render the same.
No response
No response
No response
No response
micromark 3.1.0
No response
const content = '![](/imgs/i1.png?_a=center&_w=300)'
const html = micromark(content, {
extensions: [gfm()],
htmlExtensions: [gfmHtml()],
});
console.log(html)
<p><img src="/imgs/i1.png?_a=center&_w=300" alt="" /></p>
<p><img src="/imgs/i1.png?_a=center&_w=300" alt="" /></p>
Node v16
pnpm
macOS
Other (please specify in steps to reproduce)
I moved from react-markdown to micromark for the reasons below:
However, since version 1.x the build target is ES2020: there are a lot of consts, lets, method shorthands, and more, which will definitely break my app.
ESM is a trend, and webpack can deal with it by default, but the ES6+ syntax is another matter: I may have to configure Babel, and it might cost more compile time.
So might it be possible to set the build target to ES5, which is for now the most compatible output? It does not hurt ESM modules.
part of my tsconfig.json
"target": "es5",
"lib": [
"dom",
"es2015",
"es2017"
],
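Rather than changing micromark's build target, one workaround is transpiling it at the app level. A hedged webpack sketch, assuming babel-loader and @babel/preset-env are installed (the include pattern is illustrative, not an exhaustive list of packages):

```javascript
// webpack.config.js fragment: run micromark's ES2020 output through Babel
// so the bundle is ES5-compatible.
module.exports = {
  module: {
    rules: [
      {
        test: /\.m?js$/,
        // only transpile the packages that ship modern syntax
        include: /node_modules[\\/](micromark|mdast-util-)/,
        use: {
          loader: 'babel-loader',
          options: {presets: ['@babel/preset-env']}
        }
      }
    ]
  }
}
```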
no..
Issue from react-markdown: remarkjs/react-markdown#812
But potentially the root of the issue could live in the md parser. Below I have linked the repro links and comments from the other issue:
When processing the MD string
***123****456*
<em>
<strong>123</strong>
</em>
<em>456</em>
React markdown renders what seems to be some additional asterisks?
react-markdown
No response
Compare the result of:
This is just 1 word, where the first half is both italicized and bolded, the 2nd half is only italicized.
The MDAST that gets created from unified() => rehypeParse => rehypeRemark looks correct, so to me the issue seems to be either:
No response
No response
No response
No response
3.1.0
https://github.com/chudoklates/micromark-error-demo
Use repo provided above.
Generally, for this error to occur, the parser needs to be run through Webpack in development mode. There also needs to be an extension which calls effects.consume() in its syntax before effects.enter() is called.
Actions which are permissible in the production distribution should also be permissible in development mode.
A TypeError is thrown when the code reaches this assertion:
// at the point of error: code: 123, context.events: []
assert(
code === null
? context.events.length === 0 ||
context.events[context.events.length - 1][0] === 'exit'
: context.events[context.events.length - 1][0] === 'enter',
'expected last token to be open'
)
Uncaught TypeError: Cannot read properties of undefined (reading '0')
at Object.consume (create-tokenizer.js:246:52)
at onStart (extensions.js:45:13)
at start (create-tokenizer.js:460:12)
at start (create-tokenizer.js:401:46)
at start (text.js:49:30)
at go (create-tokenizer.js:229:13)
at main (create-tokenizer.js:209:11)
at Object.write (create-tokenizer.js:135:5)
at subcontent (index.js:198:17)
at subtokenize (index.js:90:30)
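The crash itself comes from indexing into an empty events array: with context.events of length 0, events[length - 1] is undefined and reading [0] throws. A generic defensive check (hypothetical, not micromark's code) avoids the TypeError:

```javascript
// Return true only when the last recorded event is an 'enter',
// without crashing on an empty event list.
function lastEventIsEnter(events) {
  const tail = events[events.length - 1]
  return tail !== undefined && tail[0] === 'enter'
}

console.log(lastEventIsEnter([]))                  // false, no crash
console.log(lastEventIsEnter([['enter', {}, {}]])) // true
```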
Node v16
yarn v1
macOS
Webpack
See: https://svelte.dev/repl/982673f97faa457692eb4d7bd51998df?version=3.29.0
TL;DR: Some languages use different numerals (e.g. ١,٢,٣ instead of 1,2,3). Can those numerals also be used to mark lists?
A quick test with babelmark indicated that https://github.com/dotnet/docfx supports this. https://babelmark.github.io/?text=%D9%A1.+%D9%85%D8%B1%D8%AD%D8%A8%D8%A7%0A%D9%A2.+%D8%A8%D8%A7%D9%84%D8%B9%D8%A7%D9%84%D9%85
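CommonMark restricts ordered-list markers to the ASCII digits 0-9, which is why Arabic-Indic numerals are not recognized by spec-following parsers. A quick check of the two character classes shows the distinction:

```javascript
// CommonMark ordered-list markers only allow ASCII 0-9.
const asciiDigit = (ch) => /^[0-9]$/.test(ch)
// Unicode "decimal number" class, which includes Arabic-Indic digits.
const anyDigit = (ch) => /^\p{Nd}$/u.test(ch)

console.log(asciiDigit('\u0661')) // false: Arabic-Indic one is not ASCII
console.log(anyDigit('\u0661'))   // true
console.log(asciiDigit('1'))      // true
```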
3.0.5
https://stackblitz.com/edit/node-qr2fly?file=index.js
import { micromark } from "micromark";
import { Parser, HtmlRenderer } from "commonmark";
const reader = new Parser();
const writer = new HtmlRenderer();
const commonmark = (buf) => writer.render(reader.parse(buf));
const content = `<test:what>`;
console.log(micromark(content));
console.log(commonmark(content));
micromark and commonmark should produce the same HTML output
<p><a href="test:what">test:what</a></p>
micromark produces different HTML
<p><a href="">test:what</a></p>
Node v16
npm v7
Linux
No response
micromark-core-commonmark@npm:1.0.4, micromark-extension-gfm-autolink-literal@npm:1.0.2, micromark-extension-gfm-footnote@npm:1.0.2, micromark-extension-gfm-strikethrough@npm:1.0.3, micromark-extension-gfm-table@npm:1.0.4, micromark-extension-gfm-task-list-item@npm:1.0.2
No response
https://unpkg.com/[email protected]/package.json and you can see that uvu
is in the dependencies and not devDeps
uvu should be set in the dev deps so that, when installing any of the packages define here (like micromark-core-commonmark
), uvu
wouldn't be installed to (it's a test runner, not used in the runtime code)
uvu
is listed in the deps so it get's installed
No response
No response
No response
No response
No response
The package.json published for micromark-util-encode v1.0.0 contains "types": "index.d.ts":
https://unpkg.com/browse/[email protected]/package.json
Yet the index.d.ts file is not in the files whitelist:
micromark/packages/micromark-util-encode/package.json
Lines 33 to 35 in efe9c4d
Most likely that is the reason index.d.ts isn't published:
https://unpkg.com/browse/[email protected]/
I haven't checked whether other micromark packages have a similar issue; this is just what I discovered in my particular project:
Types declared in the package.json should be published.
Types declared in the package.json are not published.
Node v16
npm v7
macOS
Other (please specify in steps to reproduce)
I'm in the process of migrating my markdown editor to use remark/micromark instead of markdown-it. One of my goals is not to change the formatting style of my users' input files, at least if I can help it.
At the moment, the micromark tokenizers seem not to record information that might help reconstruct the original input Markdown in cases where Markdown has redundancies:
- _ vs * for emphasis
- * vs - vs + for unordered lists
- * vs - vs = for hrule / thematic breaks, as well as the length of the string used to indicate the break
I'm not terribly interested in preserving superfluous whitespace the user might have, but it would be nice to at least preserve their preferences for emphasis / heading / list syntax. For instance, I personally like to use * for regular lists and + / - for pro/con lists, and at the moment there's no way to preserve that information.
Thanks!
#18 fed with https://github.com/remarkjs/remark/blob/8108fe54e04640dda119aad366d70e6edf2602f1/test/fixtures/input/title-attributes.text can trigger a "call stack exceeded" issue in unravelLinkedTokens.
These files are pretty large, 1 MB and around 30k lines apiece; a more minimal example, at 105 kB, is also included.
It seems to be related to unterminated links, but more research is needed.
var fs = require('fs')
var micromark = require('./index')
// var doc = fs.readFileSync('crash-395a731d55c510f1338b8c9911c159ab56329d18bc3a12a26b826b750d0b1253.txt')
// var doc = fs.readFileSync('crash-4bf6a4882505b11dea88b5e16e6f0d3766252601ae704e42ebe606d270f9f26f.txt')
var doc = fs.readFileSync('crash-7182fa3e89e1b8fb28bda27b6da6b3769f05b1ce68551d96c46acd0931d95004.txt')
var result = micromark(doc)
console.log(result)
crash-7182fa3e89e1b8fb28bda27b6da6b3769f05b1ce68551d96c46acd0931d95004.txt
crash-4bf6a4882505b11dea88b5e16e6f0d3766252601ae704e42ebe606d270f9f26f.txt
crash-395a731d55c510f1338b8c9911c159ab56329d18bc3a12a26b826b750d0b1253.txt
a more minimal example of what may be the same issue ([]( repeated 35k times in a 105 kB file)
repeated-unterminated-links.txt
If possible no error, alternatively a better error message could help.
RangeError: Maximum call stack size exceeded
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:16585)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
at unravelLinkedTokens (micromark/dist/util/subtokenize.js:1:17944)
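The general fix for this class of crash is replacing unbounded self-recursion with iteration, since an iterative loop uses constant stack depth. A generic sketch of the transformation (not micromark's actual unravelLinkedTokens):

```javascript
// Recursive form: each step adds a stack frame, so large n overflows.
function countDownRecursive(n) {
  if (n === 0) return 0
  return countDownRecursive(n - 1)
}

// Iterative form: constant stack depth regardless of n.
function countDownIterative(n) {
  while (n > 0) n -= 1
  return n
}

console.log(countDownIterative(1000000)) // 0, no RangeError
```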
While scanning my dependencies I found that micromark NPM packages don't include their actual license file. I believe it would make sense for the micromark NPM packages to include the license since the MIT license requires that it be included in all copies or substantial portions of the Software.
Since there are 22 npm packages in the repo and they would presumably all use the same license from the root repo directory, I propose adding a release script that copies the license file from the root repo directory into each of the package directories, like this one from vue-router. I think it would then make sense to allow git to ignore license files in the package directories (but still allow npm to include them).
It could also be solved by copy-pasting the license into each of the package directories. I think that may not be preferable, due to causing duplicative content in the repo.
micromark is developed jointly with CMSM: Common Markup State Machine, as it’s sometimes easier to make changes in prose.
If you’re interested in micromark, also definitely check out CMSM!
micromark doesn't accept {disable: {null: []}} as an extension when using TypeScript.
[email protected], [email protected]
please check https://github.com/issueset/micromark-disable-typescript-issue
No TypeScript error. And it would be better to have some documentation in the README.md for this feature.
Type '{ disable: { null: string[]; }; }' is not assignable to type 'SyntaxExtension[]'.
Given that on large markdown files we are dealing with tons (literally, 100k or so) of events, improving performance might be switching from arrays to linked event objects.
Operations on big arrays can be slow, such as #21.
Switching to linked lists adds complexity (while removing it in certain other cases!), but will probably/hopefully improve perf.
We’re already using really fast array methods. And everything is mutating already. Maybe linked lists won’t net a lot.
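For illustration, inserting next to a known node in a doubly linked list is O(1), where the equivalent array splice is O(n). A minimal sketch of the node operations under discussion (not micromark's data model):

```javascript
// O(1) insertion after a given node in a doubly linked list.
function insertAfter(node, newNode) {
  newNode.prev = node
  newNode.next = node.next
  if (node.next) node.next.prev = newNode
  node.next = newNode
}

const a = {value: 'enter'}
const b = {value: 'exit'}
insertAfter(a, b)
console.log(a.next === b && b.prev === a) // true
```

An array splice at position k, by contrast, must shift every later element, which is what makes operations on 100k-event arrays expensive.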
Say we take:
Do we backtrack to before the blank lines and check all the tokenisers again (blank line is last, probably), or is there knowledge of which other tokenisers are enabled, so we can “eat” every blank line directly?
The trade-off here is that either, with knowledge of other tokens, we can be more performant and scan the buffer fewer times, or we are more extensible, allowing blank lines to be turned off, or alternative tokenisers from extensions to deal with them.
3.0.0 (via mdast-util-from-markdown 1.0.0)
Please let me know if this issue should be moved to mdast-util-from-markdown, but I think the bug is somewhere in micromark source :)
https://codesandbox.io/s/naughty-ptolemy-y62bf?file=/src/index.ts
Content.
2. Hello
3. world
2. Hello
3. world
2. Hello
3. world
Content
1. Hello
2. world
Lists starting with non-1 numbers are parsed correctly.
GitHub handles it:
micromark pre-3 also handled it correctly.
Lists starting with non-1 numbers are not parsed correctly when some paragraph, or even an empty line, is present before them (in a container?) 🤷
Content.
2. Hello
3. world
{
"type": "root",
"children": [
{
"type": "paragraph",
"children": [
{
"type": "text",
"value": "Content.",
"position": {
"start": {
"line": 2,
"column": 1,
"offset": 1
},
"end": {
"line": 2,
"column": 9,
"offset": 9
}
}
}
],
"position": {
"start": {
"line": 2,
"column": 1,
"offset": 1
},
"end": {
"line": 2,
"column": 9,
"offset": 9
}
}
},
{
"type": "paragraph",
"children": [
{
"type": "text",
"value": "2. Hello\n3. world",
"position": {
"start": {
"line": 4,
"column": 1,
"offset": 11
},
"end": {
"line": 5,
"column": 9,
"offset": 28
}
}
}
],
"position": {
"start": {
"line": 4,
"column": 1,
"offset": 11
},
"end": {
"line": 5,
"column": 9,
"offset": 28
}
}
}
],
"position": {
"start": {
"line": 1,
"column": 1,
"offset": 0
},
"end": {
"line": 6,
"column": 1,
"offset": 29
}
}
}
I do not think it's build- or runtime-dependent (it's some construct issue), but it happens in both browser and Node, on both Windows and Linux.
I'm writing an extension where I would need the definitions defined in the document.
The definitions should be available either via the context (this.definitions) or via getData('mediaDefinitions').
The latter is probably better.
I could probably overwrite all the definition-related enter and exit methods and track them as well, but this sounds like the wrong approach.
To get more clarity on where this fits in the @unifiedjs ecosystem, could the assigned folks add some example usages of this library in this issue please?
Examples of how this would be used by @remarkjs and/or @unifiedjs would be helpful, as they would clear up the following questions:
- How remark-parse is written?
- How this is used via processor.use from the @unifiedjs world?
- ...and any other questions you folks can come up with.
The idea behind this is to discuss and land on a common understanding of this project's technical goals (e.g., is this a lexer? a parser? I've seen both words around here, leading to some confusion), nail down the API surface, and identify potential extension points. This should help speed up dev, lead to some early "documentation", and prevent misalignment on the goals.
Thanks!
const micromark = require("micromark/lib");
const directive = require("micromark-extension-directive");
micromark("[!]:)", "utf-8", { extensions: [directive()] });
throws
AssertionError [ERR_ASSERTION]: expected non-empty token (`chunkString`)
├── [email protected]
└── [email protected]
Node v15.2.1, npm 7.0.8
run:
const micromark = require("micromark/lib");
const directive = require("micromark-extension-directive");
micromark("[!]:)", "utf-8", { extensions: [directive()] });
No error, or a more specific markdown syntax related error
AssertionError [ERR_ASSERTION]: expected non-empty token (`chunkString`)
micromark 3.0.10, mdast-util-from-markdown 1.2.0
https://codesandbox.io/s/awesome-elbakyan-c0gus?file=/src/index.ts
If you remove the line between Some HTML and Spanning multiple lines, it does work, but the excess whitespace confuses the parser (it thinks the extra line means a new paragraph starts).
The HTML should all be combined in a single html node.
The parsing fails.
I’m going to post a couple of problems I foresee as I’m trying to wrap my head around what micromark will be.
Take the following example:
>␉␠indented.code("in a block quote")
It’s a block quote marker, followed by a tab (tabs are forced to be treated as four spaces).
The first “virtual space” of the tab is part of the block quote marker. The second three “virtual spaces” are part of the indent of the indented code.
One extra real space, and you’ve got a code indent of four spaces, making it a proper indented code, in a block quote.
How is that represented as tokens? In a CST?
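The virtual-space arithmetic follows from CommonMark's tab stops of 4. A small hypothetical helper computes the column at which the character after a tab lands:

```javascript
// 1-based column of the character following a tab, with tab stops every 4.
function nextTabStop(column) {
  return column + 4 - ((column - 1) % 4)
}

// A tab right after the ">" marker (column 2) jumps to column 5,
// contributing three virtual spaces toward the code indent.
console.log(nextTabStop(2)) // 5
```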
There are two main places where parsing is done that is (potentially) useless.
- lookaheadConstruct: dropping it improves performance by 13%. The alternative should be possible and hopefully is not too big.
- document: dropping it completely improves performance by 28% (although lists are complex, so some time spent there is unavoidable).
No response
user@HOST micromark-issue % npm ls micromark
micromark-issue@ /Users/user/micromark-issue
└── [email protected]
user@HOST micromark-issue % cat issue.mjs
import { parse } from "micromark/lib/parse";
import { postprocess } from "micromark/lib/postprocess";
import { preprocess } from "micromark/lib/preprocess";
function repro(markdown) {
console.log("trying...");
const encoding = undefined;
const end = true;
const options = undefined;
const chunks = preprocess()(markdown, encoding, end);
const parseContext = parse(options).document().write(chunks);
const events = postprocess(parseContext);
for (const event of events) {
const [ _, token, context ] = event;
context.sliceSerialize(token);
}
console.log("ok");
}
repro("Heading\n=======");
repro("\nHeading\n=======");
user@HOST micromark-issue % node issue.mjs
trying...
ok
trying...
file:///Users/user/micromark-issue/node_modules/micromark/lib/create-tokenizer.js:520
view[0] = view[0].slice(startBufferIndex)
^
TypeError: view[0].slice is not a function
at sliceChunks (file:///Users/user/micromark-issue/node_modules/micromark/lib/create-tokenizer.js:520:25)
at sliceStream (file:///Users/user/micromark-issue/node_modules/micromark/lib/create-tokenizer.js:154:12)
at Object.sliceSerialize (file:///Users/user/micromark-issue/node_modules/micromark/lib/create-tokenizer.js:149:28)
at repro (file:///Users/user/micromark-issue/issue.mjs:15:13)
at file:///Users/user/micromark-issue/issue.mjs:21:1
at ModuleJob.run (node:internal/modules/esm/module_job:198:25)
at async Promise.all (index 0)
at async ESMLoader.import (node:internal/modules/esm/loader:385:24)
at async loadESM (node:internal/process/esm_loader:88:5)
at async handleMainPromise (node:internal/modules/run_main:61:12)
user@HOST micromark-issue %
sliceSerialize should always be safe to call in a manner like the above and should return a meaningful string. The presence of a leading \n in the Markdown (for example) should not need to be guarded against by library users.
Exception, see above
Node v16
npm v7
macOS
Other (please specify in steps to reproduce)
With the old remark parser, link references that didn't have a corresponding definition were nonetheless detected and converted to mdast.
for example, the following:
> [!NOTE]
> This is a note. Who'd have noted?
used to generate:
{
"type": "root",
"children": [
{
"type": "blockquote",
"children": [
{
"type": "paragraph",
"children": [
{
"type": "linkReference",
"identifier": "!note",
"label": "!NOTE",
"referenceType": "shortcut",
"children": [
{
"type": "text",
"value": "!NOTE"
}
]
},
{
"type": "text",
"value": "\nThis is a note. Who'd have noted?"
}
]
}
]
}
]
}
With micromark, the linkReference is not inserted if there is no corresponding definition, and a plain paragraph is generated:
{
"type": "root",
"children": [
{
"type": "blockquote",
"children": [
{
"type": "paragraph",
"children": [
{
"type": "text",
"value": "[!NOTE]\nThis is a note. Who’d have noted?"
}
]
}
]
}
]
}
It should still generate a linkReference node in the mdast, so that the client of the mdast can decide how to handle a missing definition.
Currently, the parser ignores the link reference if no definition is defined.
4.0.0
I've created a minimal reproduction of the issue here:
https://codesandbox.io/s/trusting-star-wv879z?file=/src/index.mjs
I'd expect the string to be compiled into a paragraph with the hyphen at the start of the second line.
The string is compiled into a level 2 heading
No response
No response
No response
No response
Input (Japanese) :
console.log(micromark("1. **新規アプリの追加(NEW APP)**を選択します。"));
Output - incorrect:
<ol>
<li>**新規アプリの追加(NEW APP)**を選択します。</li>
</ol>
Emphasis should be parsed correctly
<ol>
<li><strong>新規アプリの追加(NEW APP)</strong>を選択します。</li>
</ol>
The same text in English and Chinese
Input (English):
console.log(micromark("1. Select **NEW APP** (top-left corner)"));
Output - correct:
<ol>
<li>Select <strong>NEW APP</strong> (top-left corner)</li>
</ol>
Input (Chinese):
console.log(micromark("1. 选择**添加应用**(左上角)"));
Output - correct:
<ol>
<li>选择<strong>添加应用</strong>(左上角)</li>
</ol>
This bug appeared when I switched to [email protected] from [email protected].
The next code works correctly:
import unified from "unified";
import markdown from "remark-parse"; // 8.0.3
import rehype from "remark-rehype"; // 8.0.0
import stringify from "rehype-stringify"; // 8.0.0
unified()
.use(markdown)
.use(rehype)
.use(stringify)
.process("1. __新規アプリの追加(NEW APP)__を選択します。", function(err, file) {
console.log(String(file));
});
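For what it's worth, CommonMark's flanking rules appear to explain the three cases: the closing ** in the Japanese input is preceded by the punctuation ） and followed by the letter を, so it is not right-flanking and cannot close the emphasis. A sketch of that rule (my reading of the spec, not micromark's code):

```javascript
// Sketch of CommonMark's "right-flanking delimiter run" test.
// A run closes emphasis only if it is not preceded by whitespace, and
// is either not preceded by punctuation, or followed by whitespace or
// punctuation. `undefined` stands for start/end of input.
const isPunct = (ch) => ch !== undefined && /[\p{P}\p{S}]/u.test(ch)
const isSpace = (ch) => ch === undefined || /\s/.test(ch)

function isRightFlanking(before, after) {
  if (isSpace(before)) return false
  if (!isPunct(before)) return true
  return isSpace(after) || isPunct(after)
}
```

By this rule the English and Chinese closers are preceded by letters and therefore right-flanking, while the Japanese closer (punctuation before, letter after) is not, which matches the observed outputs.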
Reduce coupling and footprint of minified file by using anylogger instead of debug
Currently, this library has a dependency on debug. Though that is an excellent library, this dependency has 2 major drawbacks:

* it forces debug onto all developers that use this library (high coupling)
* debug is 3.1 kB minified and gzipped, directly adding 3.1 kB to the minimum footprint of this library

Please have a look at anylogger. It's a logging facade specifically designed for libraries. It achieves these goals:
The decoupling is achieved by only including the minimal facade to allow client code to do logging and using adapters to back that facade with an actual logging framework. The minimal footprint follows naturally from this decoupling as the bulk of the code lives in the adapter.
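The facade-plus-adapter pattern can be sketched in a few lines (a toy illustration, not anylogger's actual code):

```javascript
// Toy logging facade: the library depends only on this indirection;
// the application wires in a concrete logger (console, debug, ...)
// via an adapter. Names here are illustrative, not anylogger's API.
const loggers = new Map()
let backend = () => () => {} // default: no-op logger

function setBackend(factory) {
  backend = factory
  loggers.clear() // re-create loggers with the new backend
}

function getLogger(name) {
  if (!loggers.has(name)) loggers.set(name, backend(name))
  return loggers.get(name)
}
```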
There are already adapters for some popular logging frameworks and more adapters can easily be created:

* anylogger-console (to use the console instead of some logging framework)
* anylogger-debug
* anylogger-loglevel
* anylogger-log4js
* ulog (logger with native anylogger support)

If this library were to switch to anylogger, you could still install debug as a dev-dependency and then require('anylogger-debug') in your tests to have your tests work exactly as they always did, with debug as the logging framework, while still decoupling it from debug for all clients.
Disclaimer: anylogger was written by me, so I'm self-advertising here. However, I do honestly believe it is the best solution in this situation, and anylogger was written specifically to decrease coupling between libraries and logging frameworks, because in any large application devs typically end up with multiple loggers: some libraries depend on debug, others on loglevel, yet others on log4js, and so on. This hurts bundle size badly as we add multiple kB of logging libraries to it.
3.0.5
https://stackblitz.com/edit/node-aaphim?file=index.js
import { micromark } from "micromark";
import { Parser, HtmlRenderer } from "commonmark";
import rehypeParse from "rehype-parse";
import { unified } from "unified";
import { visit } from "unist-util-visit";
import lodash from "lodash";
const reader = new Parser();
const writer = new HtmlRenderer();
function scrubber(tree) {
visit(tree, function (node) {
node.data = undefined;
node.value = undefined;
node.position = undefined;
});
return tree;
}
const commonmark = (buf) => writer.render(reader.parse(buf));
const content = ``;
const micromarkHtml = micromark(content, {
allowDangerousHtml: true,
allowDangerousProtocol: true,
}).trim();
const commonmarkHtml = commonmark(content).trim();
const micromarkHtmlAst = scrubber(
unified().use(rehypeParse, { fragment: true }).parse(micromarkHtml)
);
const commonmarkHtmlAst = scrubber(
unified().use(rehypeParse, { fragment: true }).parse(commonmarkHtml)
);
console.log("micromark");
console.log(micromarkHtml);
console.log("");
console.log(JSON.stringify(micromarkHtmlAst, null, 4));
console.log("");
console.log("commonmark");
console.log(commonmark(content));
console.log("");
console.log(JSON.stringify(commonmarkHtmlAst, null, 2));
console.log(lodash.isEqual(micromarkHtmlAst, commonmarkHtmlAst));
📓 the character in content
is U+000C
<p></p>
with the structure
{
"type": "root",
"children": [
{
"type": "element",
"tagName": "p",
"properties": {},
"children": []
}
]
}
micromark keeps the space
<p>
</p>
changing the structure of the document
{
"type": "root",
"children": [
{
"type": "element",
"tagName": "p",
"properties": {},
"children": [
{
"type": "text"
}
]
}
]
}
Node v16
npm v7
Linux
No response
3.0.5
https://stackblitz.com/edit/node-njevp4?file=index.js
import { micromark } from 'micromark';
import { Parser, HtmlRenderer } from 'commonmark';
import rehypeParse from 'rehype-parse';
import { unified } from 'unified';
import { visit } from 'unist-util-visit';
import lodash from 'lodash';
const reader = new Parser();
const writer = new HtmlRenderer();
function scrubber(tree) {
visit(tree, function (node) {
node.data = undefined;
node.value = undefined;
node.position = undefined;
});
return tree;
}
const commonmark = (buf) => writer.render(reader.parse(buf));
const content = `example*�.*example example**`;
const micromarkHtml = micromark(content, {
allowDangerousHtml: true,
allowDangerousProtocol: true,
}).trim();
const commonmarkHtml = commonmark(content).trim();
const micromarkHtmlAst = scrubber(
unified().use(rehypeParse, { fragment: true }).parse(micromarkHtml)
);
const commonmarkHtmlAst = scrubber(
unified().use(rehypeParse, { fragment: true }).parse(commonmarkHtml)
);
console.log('micromark');
console.log(micromarkHtml);
console.log('');
console.log(JSON.stringify(micromarkHtmlAst, null, 4));
console.log('');
console.log('commonmark');
console.log(commonmark(content));
console.log('');
console.log(JSON.stringify(commonmarkHtmlAst, null, 2));
console.log(lodash.isEqual(micromarkHtmlAst, commonmarkHtmlAst));
single emphasis in the document
<p>example*�.<em>example example</em>*</p>
with the HTML structure
{
"type": "root",
"children": [
{
"type": "element",
"tagName": "p",
"properties": {},
"children": [
{
"type": "text"
},
{
"type": "element",
"tagName": "em",
"properties": {},
"children": [
{
"type": "text"
}
]
},
{
"type": "text"
}
]
}
]
}
extra emphasis is added
<p>example<em>�.<em>example example</em></em></p>
changing the structure
{
"type": "root",
"children": [
{
"type": "element",
"tagName": "p",
"properties": {},
"children": [
{
"type": "text"
},
{
"type": "element",
"tagName": "em",
"properties": {},
"children": [
{
"type": "text"
},
{
"type": "element",
"tagName": "em",
"properties": {},
"children": [
{
"type": "text"
}
]
}
]
}
]
}
]
}
Node v16
npm v7
Linux
No response
using node 18.17.1
No response
I am using botframework-webchat, and when I try to build it, the error message below pops up.
Error - [webpack] 'dist':
./node_modules/micromark-util-decode-numeric-character-reference/index.js 23:11
Module parse failed: Identifier directly after number (23:11)
You may need an appropriate loader to handle this file type, currently no loaders are configured to process this file. See https://webpack.js.org/concepts#loaders
| code > 126 && code < 160 ||
| // Lone high surrogates and low surrogates.
| code > 55_295 && code < 57_344 ||
| // Noncharacters.
| code > 64_975 && code < 65_008 || /* eslint-disable no-bitwise */
@ ./node_modules/mdast-util-from-markdown/lib/index.js 138:0-97 1061:14-45
@ ./node_modules/mdast-util-from-markdown/index.js
@ ./node_modules/botframework-webchat/lib/markdown/private/iterateLinkDefinitions.js
@ ./node_modules/botframework-webchat/lib/markdown/renderMarkdown.js
@ ./node_modules/botframework-webchat/lib/index.js
@ ./lib/extensions/chatbotExtension/renderer/Chatbot.js
@ ./lib/extensions/chatbotExtension/renderer/ChatbotPanel.js
@ ./lib/extensions/chatbotExtension/ChatbotExtensionApplicationCustomizer.js
./node_modules/micromark-util-sanitize-uri/index.js 86:22
The package should build successfully.
Currently it gives an error while running npm build.
Node v16
npm v7
Windows
Webpack
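The syntax webpack is rejecting is just ES2021 numeric separators (55_295 is 55295). A hypothetical rewrite without separators, which older parsers (such as the acorn version bundled with webpack 4) accept:

```javascript
// The same range checks as the failing lines, written without numeric
// separators so pre-ES2021 parsers accept them (helper name is mine).
function isProblematicCode(code) {
  return (
    (code > 126 && code < 160) || // C1 control-like range
    (code > 55295 && code < 57344) || // lone high and low surrogates
    (code > 64975 && code < 65008) // noncharacters
  )
}
```

In practice the simpler fix is usually upgrading webpack (v5 understands this syntax) or transpiling the affected node_modules.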
micromark
https://codesandbox.io/s/thirsty-fire-1jdcgn
parse this:
# Trailing hard-break
This break is properly detected\
yes?
But a trailing break is not\
What's worse, it leaves a stray `\`
Checking the GitHub behaviour, it's the same :-( But of course this is rather unfortunate, and it is difficult to find a workaround.
<h1>Trailing hard-break</h1>
<p>This break is properly detected<br />
yes?</p>
<p>But a trailing break is not<br /></p>
<p>What's worse, it leaves a stray <code>\</code></p>
<h1>Trailing hard-break</h1>
<p>This break is properly detected<br />
yes?</p>
<p>But a trailing break is not\</p>
<p>What's worse, it leaves a stray <code>\</code></p>
Node v14
npm v7
macOS
No response
I have been wishing to write a (simple and lightweight) spec‐compliant editor for Markdown with syntax highlighting for a while now.
Now that this library has become usable (and it seems to be the first of its kind), I have finally gotten an opportunity to write a simple editor with it! (Thank you! 🎉)
Unfortunately, there appears to be a bug in the library! The issue I’m running into is that emphases marked with ***
(both regular and strong) have their tokens misnested.
I have written a simple program to demonstrate what I mean:
import parser from "https://dev.jspm.io/[email protected]/lib/parse.js"
import preprocessor from "https://dev.jspm.io/[email protected]/lib/preprocess.js"
import postprocessor from "https://dev.jspm.io/[email protected]/lib/postprocess.js"
let preprocess = txt =>
{
let write = preprocessor()
return [...write(txt), ...write(null)]
}
let parse = text => postprocessor()(preprocess(text).flatMap(parser().document().write))
let tokens = parse("hello ***world***")
tokens.pop()
let output = ""
let i = 0
let offset
for (let [kind, {type, start, end}] of tokens)
{
let char = "→"
if (kind === "enter") offset = start.offset
else offset = end.offset, i--, char = "←"
output += `${" ".repeat(i*3) + char} ${type} at ${offset}\n`
if (kind === "enter") i++
}
console.log(output)
(Note: I’m using dev.jspm.io for now, as opposed to jspm.dev, because jspm.dev bundles the whole library into its index file, as opposed to separating it into multiple files. See more info in jspm.dev’s announcement post.)
Currently, the output is the following:
→ content at 0
→ paragraph at 0
→ data at 0
← data at 5
→ data at 5
← data at 6
→ emphasis at 8
→ emphasisSequence at 8
← emphasisSequence at 9
→ emphasisText at 9
→ strong at 6
→ strongSequence at 6
← strongSequence at 8
→ strongText at 8
→ data at 9
← data at 14
← strongText at 15
→ strongSequence at 15
← strongSequence at 17
← strong at 17
← emphasisText at 14
→ emphasisSequence at 14
← emphasisSequence at 15
← emphasis at 15
← paragraph at 17
← content at 17
As you can see, when moving from → emphasisText at 9
to → strong at 6
(as well as in other places), the indices go down, which is unexpected. This causes my highlighter to break! 😱
Thanks in advance for the attention!
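The invariant the highlighter relies on can be checked with a small helper (hypothetical, walking enter/exit events exactly as the snippet above does):

```javascript
// Check that token events are properly nested: walking enter events by
// start offset and exit events by end offset, the offsets we visit
// should never decrease. The misnested ***-emphasis output fails this.
function offsetsMonotonic(events) {
  let last = 0
  for (const [kind, token] of events) {
    const offset = kind === 'enter' ? token.start.offset : token.end.offset
    if (offset < last) return false
    last = offset
  }
  return true
}
```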
Markdown consists of blocks and inlines. Blocks are parsed per line.
Typically, at a certain point in a line, you know you’re right: take this ATX heading:
###### A heading
When standing on the space, you know you’re in a heading: it can’t be anything else. So ATX headings don’t really need to buffer a lot: at most 6 characters.
Other constructs need to buffer more, like this link definition:
[take]:
https://this-link-definition
'asd
> block quote?
asd
asd
asd
asd
Only at the last character, the line feed without a closing title marker before it, do you know you need to backtrack and parse the whole thing again. And it isn’t all a paragraph either; take, for example, the embedded > block quote?
An alternative example that needs to buffer infinitely many lines is indented code:
␠␠␠␠this is a chunk (a properly indented non-blank line)
␠␠␠
␠␠
␠
␠␠
␠␠␠
␠␠␠␠
␠␠␠␠␠
␠␠␠␠
␠␠␠
␠␠
␠
␠␠␠
<-- And only here do we know the blank lines are not part of the indented code. Note that the line endings, and more than four spaces in a blank line, still show up in the code, so if we had another chunk, all the above line endings and that one extra space would be there.
🤔 So how far does one buffer? These are edge cases, not common in normal Markdown. But it could be interesting to see if we can cap this to reduce a potential memory problem.
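One way to cap it could look like this (a sketch of the idea only, not micromark's tokenizer): buffer up to a limit, and make the caller bail out of the speculative construct when the limit is hit.

```javascript
// Toy bounded line buffer for speculative parsing: `push` reports
// whether the caller may keep speculating; once the cap is exceeded
// the caller should give up (e.g. settle for a paragraph) and `flush`
// the buffered lines for reinterpretation.
function createLineBuffer(max) {
  const lines = []
  return {
    push(line) {
      lines.push(line)
      return lines.length <= max // false -> caller must bail out
    },
    flush() {
      return lines.splice(0, lines.length) // drain and return all lines
    }
  }
}
```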
3.0.8
No response
I'm using Micromark in an astro project, and ever since installing micromark 3.0.8 I get this error:
[15:04:05] [snowpack] + [email protected]
[build] Unable to render src/pages/renew/checkout.astro
ReferenceError: document is not defined
While this might be snowpack being finicky, [email protected] works totally fine! So just wondering if anything has been introduced which could cause it.
My project builds:
[15:04:05] [snowpack] + [email protected]
[build] Unable to render src/pages/renew/checkout.astro
ReferenceError: document is not defined
Node v16
yarn v1
macOS
Snowpack
Split code in several packages, use export maps and conditions
Given that:

* development code (using, e.g., codes.greaterThan instead of the actual character code) and optimized production code are currently split into dist/ or lib/ respectively
* many internals (the tokenize/factory-* files, or the tiny things in lib/util/) are useful in micromark extensions (or inverted: many of the extensions currently use micromark’s internals)

I propose:

* making micromark/micromark a monorepo that houses a couple of projects
* micromark-factory-* as a namespace in the ecosystem for factories: some housed in the monorepo (how to parse a label), some in their own repos in this org (how micromark-extension-directive parses HTML attributes or micromark-extension-expression parses JavaScript), yet some others in the ecosystem
* micromark-core-character would expose all the ascii*, unicode*, and markdown* functions currently in micromark/lib/character
* micromark-core-constant would expose codes, constants, values, types, html-block-names, html-raw-names
* a dev/ folder which houses each micromark extension/factory/core, with a production folder built from it by copying types, inlining constants, and removing assertions
* export conditions development / production / default (same as prod I guess)?
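The development / production / default conditions could be wired up roughly like this in a package.json (a sketch based on Node.js conditional exports, not an actual micromark manifest):

```json
{
  "exports": {
    ".": {
      "development": "./dev/index.js",
      "production": "./index.js",
      "default": "./index.js"
    }
  }
}
```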
see
micromark/test/io/text/image.mjs
Line 152 in 63cf514
I don't know if this is really a bad thing, but the new behaviour of micromark is to generate an empty string for the alt attribute, whereas the old remark parser used to set the alt property of the mdast node to null.
There is a slight distinction: an image with an empty alt text is considered a decorative image and should be ignored by a screen reader, while if the alt attribute is missing, the screen reader will just read the src (not a brilliant behaviour, either :-)
https://www.w3.org/WAI/tutorials/images/decorative/
In any case, the new behaviour allows the author to specify decorative images in markdown by default, which wasn't possible before.
Not sure. But for backward compatibility's sake: a Markdown image without alt text should not create an alt attribute in HTML (the mdast node's property should be null).
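The distinction can be illustrated with a tiny serializer helper (hypothetical, not micromark's compiler): alt="" marks a decorative image, while omitting alt makes screen readers fall back to reading the src.

```javascript
// Hypothetical helper: serialize an image, distinguishing "no alt"
// (attribute omitted, alt is null/undefined) from "empty alt"
// (decorative image, alt="").
function imgTag(src, alt) {
  if (alt === null || alt === undefined) {
    return `<img src="${src}" />` // no alt: screen readers read src
  }
  return `<img src="${src}" alt="${alt}" />` // "" marks decorative
}
```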
No response
Try to import micromark-util-symbol in TypeScript.
Expected: no type error is reported.
Actual: Cannot find module 'micromark-util-symbol' or its corresponding type declarations. ts(2307)
add a field in package.json:
"types": "./lib/default.d.ts",
No response
No response
No response
No response
latest main branch
No response
I ran a profile of micromark
and noticed TokenizeContext.now
was something like the 4th most time-consuming function. It’s quite a simple function as all it does is return a copy of point
. I tried a couple of alternate implementations and found one that reduces runtime by ~11% on an Apple Silicon M1 (mac Mini). I imagine the delta is different on other hardware, so maybe other folks can give this a try on their hardware? That said, I expect this change will be more efficient everywhere because it avoids a call into Object.assign
and tells the JIT exactly what needs to be done. All tests pass with this change; you can see the code change and minimal test harness here: main...DavidAnson:micromark:TokenizeContext-now.
I added a scenario to perf.js
that reads the content of readme.md
and calls micromark
500 times. This input seems fairly representative, but I’m happy if folks want to profile on something else. The numbers below are pretty stable, so I only took three samples before/after.
The 3 readings I did before changing anything: ((17.726 + 17.786 + 17.606) / 3) = 17.706s
The 3 readings I did after making the change: ((15.75 + 15.676 + 15.688) / 3) = 15.705s
By my math, the time eliminated is: ((17.706 - 15.705) / 17.706) = 0.1130 = 11.30%
To be sure, the alternate implementation I propose here violates the encapsulation of Point
- but a simple test case could be added to ensure any future changes to Point
are accommodated.
I can send a proper PR if folks are open to this change.
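For reference, a minimal sketch of the kind of change described (the actual diff is in the linked branch): replace the generic Object.assign copy with an object literal that names Point's fields explicitly, which I assume to be line, column, offset, _index, and _bufferIndex.

```javascript
// Sketch: instead of `Object.assign({}, point)`, spell out the fields
// of Point so the engine can generate a specialized object copy.
// Field names are assumptions based on micromark's Point type.
function copyPoint(point) {
  return {
    line: point.line,
    column: point.column,
    offset: point.offset,
    _index: point._index,
    _bufferIndex: point._bufferIndex
  }
}
```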
N/A
N/A
Node v16
npm v7
macOS
No response
micromark 4.0.0, micromark-extension-gfm-autolink-literal 2.0.0
No response
Consistent treatment of [email protected]
by autolink and literalAutolink.
[email protected]
and <[email protected]>
are both emitted as literalAutolink. Expected behavior is observed for <[email protected]>
which is emitted as autolink.
This is significant for a linter which can be confused by the current behavior into adding infinite <>
wrappers attempting to turn [email protected]
from literalAutolink into autolink: DavidAnson/markdownlint#1140
I propose that <[email protected]>
should be treated as autolink, which is seemingly possible if emailAtSignOrDot
behaved differently:
micromark/packages/micromark-core-commonmark/dev/lib/autolink.js
Lines 203 to 205 in 8b08d16
The micromark tokens (when using micromark-extension-gfm-autolink-literal) for parsing the above Markdown are:
content [email protected]
paragraph [email protected]
literalAutolink [email protected]
literalAutolinkEmail [email protected]
lineEnding \n
lineEndingBlank \n
content <[email protected]>
paragraph <[email protected]>
data <
literalAutolink [email protected]
literalAutolinkEmail [email protected]
data >
lineEnding \n
lineEndingBlank \n
content <[email protected]>
paragraph <[email protected]>
autolink <[email protected]>
autolinkMarker <
autolinkEmail [email protected]
autolinkMarker >
lineEnding \n
Node v16
npm v6
macOS
Webpack
I want to have several paragraphs like this:
I am a paragraph.
I am part of the same paragraph.
But I am a new paragraph.
This is compiled to the following:
I am a paragraph. I am part of the same paragraph. But I am a new paragraph.
I'd expect the following result:
I am a paragraph. I am part of the same paragraph.
But I am a new paragraph.
I could use the <p>
tag manually in the Markdown.
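In CommonMark, only a blank line starts a new paragraph; a single newline is a soft break inside the same paragraph. That rule can be sketched with a toy splitter (an illustration, not micromark's parser):

```javascript
// Split text into paragraphs the way Markdown does: one or more blank
// lines end a paragraph; single newlines stay inside the paragraph.
function paragraphs(text) {
  return text
    .split(/\n[ \t]*\n+/) // blank line(s) separate paragraphs
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
}
```

So inserting a blank line before "But I am a new paragraph." yields the expected two paragraphs; alternatively, two trailing spaces or a trailing backslash produce a line break (br) within a single paragraph.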