yuin / goldmark
A markdown parser written in Go. Easy to extend, standard (CommonMark) compliant, well structured.
License: MIT License
Running Hugo 0.60.1 with Goldmark 1.1.8. I'm basing this off the PHP Markdown Extra spec since CommonMark doesn't support footnotes, so please bear with me.
From my reading of the PHP Markdown Extra spec for footnotes, they should always be parsed and numbered in sequential order in the document. What I've noticed in Goldmark is that they're numbered using the text in the footnote name, or that this sequential renumbering step is skipped; I'm not entirely sure which.
This[^3] is[^1] text with footnotes[^2].
[^1]: Footnote one
[^2]: Footnote two
[^3]: Footnote three
This1 is2 text with footnotes3
1: Footnote three
2: Footnote one
3: Footnote two
This3 is1 text with footnotes2
1: Footnote one
2: Footnote two
3: Footnote three
(I apologize, my first example did not reflect the actual problem. I have updated it.)
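The sequential numbering described above, where each footnote gets its number from the order in which it is first referenced, can be sketched as follows. This is a minimal illustration and not Goldmark's implementation; `numberByFirstReference` and the regular expression are assumptions made for the example, and the function expects only the body text (without the footnote definitions).

```go
package main

import (
	"fmt"
	"regexp"
)

// footnoteRef matches a footnote reference such as [^3] and captures its label.
var footnoteRef = regexp.MustCompile(`\[\^([^\]\s]+)\]`)

// numberByFirstReference assigns 1, 2, 3, ... to footnote labels in the
// order in which their references first appear in body.
func numberByFirstReference(body string) map[string]int {
	numbers := make(map[string]int)
	for _, m := range footnoteRef.FindAllStringSubmatch(body, -1) {
		label := m[1]
		if _, seen := numbers[label]; !seen {
			numbers[label] = len(numbers) + 1
		}
	}
	return numbers
}

func main() {
	n := numberByFirstReference("This[^3] is[^1] text with footnotes[^2].")
	fmt.Println(n["3"], n["1"], n["2"]) // prints: 1 2 3
}
```

With this mapping, the label `3` renders as footnote 1 because it is referenced first, regardless of the text used in the label.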
I've now merged in Goldmark as the default Markdown handler in Hugo and it works great.
I have set unsafe=false as the default, and that works mostly as expected.
But the rendering of external links comes as a surprise to most people, I think.
[Google Search!](https://google.com/)
=>
[Google Search!](https://google.com/)
So, is the security motivation behind the above perhaps to prevent fake links? But when the end result is that most people configure it to be unsafe just to get proper links, I think the net effect on security is worse.
According to https://github.com/yuin/goldmark/blob/master/.github/workflows/test.yaml, GitHub Actions runs the tests with Go 1.12.x. That makes sense only if the library is compatible with 1.12, which is not what go.mod declares.
On the other hand, the only thing preventing compatibility with 1.12 is the shift operations: before Go 1.13, a shift count had to be an unsigned integer. Is it then worth converting the loops that cause trouble to use uint instead?
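As a rough illustration of the conversion being proposed, the sketch below declares the loop counter as `uint` so that it is a valid shift count on Go 1.12 as well as on later versions. The function `bitMask` is a made-up example and is not taken from the goldmark source.

```go
package main

import "fmt"

// bitMask returns a mask with the lowest n bits set. The loop counter is a
// uint, so the shift compiles on Go versions that reject signed shift counts.
func bitMask(n uint) uint32 {
	var mask uint32
	for i := uint(0); i < n; i++ {
		mask |= 1 << i
	}
	return mask
}

func main() {
	fmt.Println(bitMask(4)) // prints: 15
}
```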
Test:
https://github.com/mironovalexey/gm-test/blob/master/line/run.go
Result:
<p>Single <code>line</code></p>
Given:
This is a line! Yes.
And this is another!
Got:
Paragraph {
RawText: "This is a line! Yes."
HasBlankPreviousLines: false
Text: "This is a line"
Text: "! Yes."
}
Paragraph {
RawText: "And this is another!"
HasBlankPreviousLines: false
Text: "And this is another"
Text: "!"
}
Expected:
Paragraph {
RawText: "This is a line! Yes."
HasBlankPreviousLines: false
Text: "This is a line! Yes."
}
Paragraph {
RawText: "And this is another!"
HasBlankPreviousLines: false
Text: "And this is another!"
}
Please answer the following before submitting your issue:
v1.1.9
go1.13.4
linux/amd64
make fuzz
boredom
sh$ cat fuzz/crashers/db13717bee8cb87337140ed44b4f9bc01214e3fb.quoted
">*\t>\n> \t0\n>\t\t0\n>0"
sh$ cat fuzz/crashers/db13717bee8cb87337140ed44b4f9bc01214e3fb.output
panic: interface conversion: ast.Node is *ast.CodeBlock, not *ast.ListItem
goroutine 1 [running]:
github.com/yuin/goldmark/parser.lastOffset(0x688820, 0xc000190630, 0x1)
/go/src/github.com/yuin/goldmark/parser/list.go:102 +0xfd
github.com/yuin/goldmark/parser.(*listParser).Continue(0x7e2060, 0x688820, 0xc000190630, 0x687780, 0xc0001e97a0, 0x687840, 0xc0001e9880, 0xa)
/go/src/github.com/yuin/goldmark/parser/list.go:192 +0x24e
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001f2000, 0x688040, 0xc00001ec00, 0x687780, 0xc0001e97a0, 0x687840, 0xc0001e9880)
/go/src/github.com/yuin/goldmark/parser/parser.go:1032 +0x558
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001f2000, 0x687780, 0xc0001e97a0, 0x0, 0x0, 0x0, 0x30, 0x632200)
/go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc000078b00, 0x7ff83a71d000, 0x11, 0x11, 0x6849e0, 0xc0000959b0, 0x0, 0x0, 0x0, 0x3a2334ec, ...)
/go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7ff83a71d000, 0x11, 0x11, 0x3)
/go/src/github.com/yuin/goldmark/fuzz/fuzz.go:23 +0x269
go-fuzz-dep.Main(0xc00029bf48, 0x1, 0x1)
go-fuzz-dep/main.go:36 +0x1ad
main.main()
github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
exit status 2
How should French typographic rules be applied: through an extension, or is it something that depends on the Go language itself, or on another Go package? Something like SmartyPants, but with more rules specific to a language.
For instance, languages like PHP have libraries to handle this: https://github.com/jolicode/JoliTypo
Some of the French typographic rules are listed by the Grammalecte Firefox extension:
The Goldmark 1.1.8 implementation only takes one-byte code points (ASCII) into account when generating auto heading IDs, simply discarding extended Latin characters (2 bytes) and other international characters (3 bytes).
https://github.com/yuin/goldmark/blob/master/parser/parser.go#L83-L85
In multilingual sites, this causes imperfect heading IDs to be generated.
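For comparison, a Unicode-aware ID generator could look roughly like the sketch below. It is only an illustration of the requested behaviour, not goldmark's code: `headingID` is a hypothetical helper, and the choice to lowercase letters and to join words with `-` is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// headingID builds an ID from a heading title, keeping letters and digits
// from any script, lowercasing them, and joining words with a single dash.
// Other punctuation is dropped.
func headingID(title string) string {
	var b strings.Builder
	pendingDash := false
	for _, r := range strings.TrimSpace(title) {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			if pendingDash && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingDash = false
			b.WriteRune(unicode.ToLower(r))
		case unicode.IsSpace(r) || r == '-' || r == '_':
			pendingDash = true
		}
	}
	return b.String()
}

func main() {
	fmt.Println(headingID("Événements à Zürich")) // prints: événements-à-zürich
	fmt.Println(headingID("Hello, World!"))       // prints: hello-world
}
```

Iterating over runes rather than bytes is what keeps the multi-byte characters intact.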
Tested with Goldmark 1.16.
package main
import (
"bytes"
"fmt"
"log"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
)
func main() {
convert(`Foo|Bar
---|---
` + "`" + `Yoyo` + "`" + `|Dyne`)
}
func convert(src string) {
markdown := goldmark.New(
goldmark.WithExtensions(
extension.Table,
),
)
var buf bytes.Buffer
err := markdown.Convert([]byte(src), &buf)
if err != nil {
log.Fatal(err)
}
fmt.Println(buf.String())
}
Produces
<table>
<thead>
<tr>
<th>Foo</th>
<th>Bar</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>Yoyo</code>|Dyne</td>
<td></td>
</tr>
</tbody>
</table>
The "try it" on https://commonmark.org/help/tutorial/02-emphasis.html renders it correctly; https://spec.commonmark.org/dingus/ renders nothing.
func (*testTransformer) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
processNodes(node)
}
func processNodes(n ast.Node) {
if n.Kind() == ast.KindHeading {
if p := n.Parent(); p != nil {
p.RemoveChild(p, n)
}
return
}
for c := n.FirstChild(); c != nil; c = n.NextSibling() {
processNodes(c)
}
}
Source markdown:
# Header 1
text
## Header 2
text
Result:
<p>text</p>
<h2>Header 2</h2>
<p>text</p>
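Note that the loop in the snippet above advances with `n.NextSibling()` (the sibling of the parent) rather than with the sibling of the current child, and that it removes nodes while iterating over them. A common pattern for this kind of traversal is to remember the next sibling before a node is possibly removed. The sketch below shows the pattern on a minimal stand-in tree; the `node` type and its methods are assumptions made for the example and are not goldmark's `ast.Node`.

```go
package main

import "fmt"

// node is a minimal stand-in for a tree node with sibling links.
type node struct {
	kind                    string
	parent                  *node
	firstChild, nextSibling *node
}

// appendChild adds c as the last child of n.
func (n *node) appendChild(c *node) {
	c.parent = n
	if n.firstChild == nil {
		n.firstChild = c
		return
	}
	last := n.firstChild
	for last.nextSibling != nil {
		last = last.nextSibling
	}
	last.nextSibling = c
}

// removeChild unlinks c from n and clears the links held by c.
func (n *node) removeChild(c *node) {
	if n.firstChild == c {
		n.firstChild = c.nextSibling
	} else {
		prev := n.firstChild
		for prev != nil && prev.nextSibling != c {
			prev = prev.nextSibling
		}
		if prev == nil {
			return // c is not a child of n
		}
		prev.nextSibling = c.nextSibling
	}
	c.parent, c.nextSibling = nil, nil
}

// removeKind deletes every descendant of n whose kind matches.
func removeKind(n *node, kind string) {
	for c := n.firstChild; c != nil; {
		next := c.nextSibling // saved before c may be unlinked
		if c.kind == kind {
			n.removeChild(c)
		} else {
			removeKind(c, kind)
		}
		c = next
	}
}

// kinds lists the kinds of the direct children of n.
func kinds(n *node) []string {
	var out []string
	for c := n.firstChild; c != nil; c = c.nextSibling {
		out = append(out, c.kind)
	}
	return out
}

func main() {
	doc := &node{kind: "document"}
	for _, k := range []string{"heading", "paragraph", "heading", "paragraph"} {
		doc.appendChild(&node{kind: k})
	}
	removeKind(doc, "heading")
	fmt.Println(kinds(doc)) // prints: [paragraph paragraph]
}
```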
package main
import (
"bytes"
"fmt"
"log"
"runtime/debug"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark"
)
func main() {
convert(`#
# FOO`)
}
func convert(src string) {
defer func() {
if r := recover(); r != nil {
fmt.Println("Panic:\n", string(debug.Stack()))
}
}()
markdown := goldmark.New(
goldmark.WithParserOptions(
parser.WithAutoHeadingID(),
),
)
var buf bytes.Buffer
err := markdown.Convert([]byte(src), &buf)
if err != nil {
log.Fatal(err)
}
}
github.com/yuin/goldmark/text.(*Segments).At(...)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/text/segment.go:182
github.com/yuin/goldmark/parser.generateAutoHeadingID(0xc0001ba000, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/atx_heading.go:190 +0x219
github.com/yuin/goldmark/parser.(*atxHeadingParser).Close(0xc00012120e, 0x122a0c0, 0xc0001ba000, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/atx_heading.go:173 +0xba
github.com/yuin/goldmark/parser.(*parser).closeBlocks(0xc0001b1500, 0x0, 0x0, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:845 +0x162
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001b1500, 0x1229c40, 0xc000126780, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:1058 +0x753
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001b1500, 0x1229380, 0xc0001aa7e0, 0x0, 0x0, 0x0, 0x8, 0x8)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:818 +0x148
github.com/yuin/goldmark.(*markdown).Convert(0xc000123a40, 0xc000121230, 0x7, 0x8, 0x1226bc0, 0xc00009a9f0, 0x0, 0x0, 0x0, 0xc0001ae000, ...)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/markdown.go:116 +0x94
main.convert(0x11f960a, 0x7)
/Users/bep/dev/go/bep/temp/main.go:33 +0x1d5
main.main()
/Users/bep/dev/go/bep/temp/main.go:16 +0x36
package main
import (
"bytes"
"fmt"
"log"
"github.com/yuin/goldmark"
)
func main() {
content := `**Bold:**Regular`
markdown := goldmark.New()
var buf bytes.Buffer
err := markdown.Convert([]byte(content), &buf)
if err != nil {
log.Fatal(err)
}
fmt.Println(buf.String())
}
Prints:
<p>**Bold:**Regular</p>
It seems that parser.WithAttribute() processing does not always work well.
https://github.com/mironovalexey/gm-test/tree/master/hattrs
https://github.com/mironovalexey/gm-test/blob/master/hattrs/test.md
Using this file:
package main
import (
"bytes"
gm "github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
)
func main() {
var s1 = []byte("https://github.com?q=stars:>1")
var s2 bytes.Buffer
gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
print(s2.String())
}
I get this result:
<p><a href="https://github.com?q=stars:">https://github.com?q=stars:</a>>1</p>
With github.com parser, I get this result:
<p><a href="https://github.com?q=stars:%3E1">https://github.com?q=stars:>1</a></p>
Example:
This is a feature request. Both Pandoc and Blackfriday support a different version of footnotes than the one currently supported by Goldmark. Pandoc refers to them as inline footnotes. The syntax looks like this:
This is a sentence.^[This is footnote one.] This is also a sentence.^[This will become footnote two.]
Would it be possible for Goldmark to support this kind of footnote?
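Until such support exists, one conceivable workaround is to rewrite inline footnotes into reference-style footnotes before conversion. The following is a rough sketch under stated assumptions: `expandInlineNotes` and the `inline-N` labels are invented for the example, and the regular expression does not handle nested brackets or text inside code spans.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// inlineNote matches ^[text] and captures the footnote text.
var inlineNote = regexp.MustCompile(`\^\[([^\]]+)\]`)

// expandInlineNotes replaces each ^[text] with a [^inline-N] reference and
// appends the matching definitions at the end of the document.
func expandInlineNotes(src string) string {
	var defs []string
	out := inlineNote.ReplaceAllStringFunc(src, func(m string) string {
		text := inlineNote.FindStringSubmatch(m)[1]
		label := fmt.Sprintf("inline-%d", len(defs)+1)
		defs = append(defs, fmt.Sprintf("[^%s]: %s", label, text))
		return "[^" + label + "]"
	})
	if len(defs) == 0 {
		return src
	}
	return out + "\n\n" + strings.Join(defs, "\n") + "\n"
}

func main() {
	fmt.Print(expandInlineNotes("This is a sentence.^[This is footnote one.] This is also a sentence.^[This will become footnote two.]"))
}
```

The rewritten text can then be passed to a converter that has the regular footnote extension enabled.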
It seems that there should be one more built-in goldmark extension: a YAML metadata block. Then real GFM would be fully supported.
goldmark does not linkify the following links:
搜索引擎链接https://www.google.com
OR
搜索引擎链接:https://www.google.com
First of all, thanks a lot for the work on goldmark. I just tried it with the new release and it works great. Though, there is one minor imperfection:
The unsafe option is turned on and there is HTML code inside a paragraph, like this:
This is **Bold** <span>Component</span><template>
<div>Name</div>
</template> **Bold** as well.
This will render as:
<p>This is <strong>Bold</strong> <span>Component</span><template></p>
<div>Name</div>
</template> **Bold** as well.
Notice how the closing tag </p> is set too early. If I remove the line break after <template> it works as expected:
This is **Bold** <span>Component</span><template> <div>Name</div>
</template> **Bold** as well.
<p>This is <strong>Bold</strong> <span>Component</span><template> <div>Name</div>
</template> <strong>Bold</strong> as well.</p>
Of course I can just move the div up, but there are other divs in my template as well (they also trigger </p> too early), so this one line would become quite long and hard to maintain. Basically, the template tag and everything inside it should not trigger the automatic closing </p>.
Even though I use Hugo to render, I think this is a goldmark related issue.
Please answer the following before submitting your issue:
v1.1.10
go1.13.4
linux/amd64
make fuzz
boredom
sh$ cat fuzz/crashers/db0b78ba444c6efd83f1d4f6f74faab82aaf3cb5.quoted
"{\n-"
sh$ cat fuzz/crashers/db0b78ba444c6efd83f1d4f6f74faab82aaf3cb5.output
panic: runtime error: index out of range [0] with length 0
goroutine 1 [running]:
github.com/yuin/goldmark/parser.parseAttribute(0x688e40, 0xc0001bd650, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f33d7475000)
/go/src/github.com/yuin/goldmark/parser/attribute.go:102 +0xa9b
github.com/yuin/goldmark/parser.ParseAttributes(0x688e40, 0xc0001bd650, 0x0, 0x1, 0x0, 0x1)
/go/src/github.com/yuin/goldmark/parser/attribute.go:61 +0x1e2
github.com/yuin/goldmark/parser.parseLastLineAttributes(0x689b80, 0xc00016a090, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
/go/src/github.com/yuin/goldmark/parser/atx_heading.go:229 +0x429
github.com/yuin/goldmark/parser.(*setextHeadingParser).Close(0xc000117c70, 0x689b80, 0xc00016a090, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
/go/src/github.com/yuin/goldmark/parser/setext_headings.go:107 +0x56e
github.com/yuin/goldmark/parser.(*parser).closeBlocks(0xc000197500, 0x0, 0x0, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
/go/src/github.com/yuin/goldmark/parser/parser.go:845 +0x199
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc000197500, 0x689700, 0xc00001e980, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
/go/src/github.com/yuin/goldmark/parser/parser.go:1023 +0xc12
github.com/yuin/goldmark/parser.(*parser).Parse(0xc000197500, 0x688e40, 0xc0001bd500, 0x0, 0x0, 0x0, 0x30, 0x633700)
/go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc000119e80, 0x7f33d7475000, 0x3, 0x3, 0x685fa0, 0xc00007cff0, 0x0, 0x0, 0x0, 0x9, ...)
/go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7f33d7475000, 0x3, 0x3, 0x4)
/go/src/github.com/yuin/goldmark/fuzz/fuzz.go:34 +0x43c
go-fuzz-dep.Main(0xc00026ff48, 0x1, 0x1)
go-fuzz-dep/main.go:36 +0x1ad
main.main()
github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
package main
import (
"bytes"
"fmt"
"log"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
)
func main() {
convert(`
Go to [http://www.example.com](www.example.com) or http://www.example.com.
`)
}
func convert(src string) {
markdown := goldmark.New(
goldmark.WithExtensions(extension.Linkify),
)
var buf bytes.Buffer
err := markdown.Convert([]byte(src), &buf)
if err != nil {
log.Fatal(err)
}
fmt.Println(buf.String())
}
Prints:
<p>Go to <a href="www.example.com"><a href="http://www.example.com">http://www.example.com</a></a> or <a href="http://www.example.com">http://www.example.com</a>.</p>
Please answer the following before submitting your issue:
v1.1.9
go1.13.4
linux/amd64
make fuzz
boredom
sh$ cat fuzz/crashers/374f2bf4f9cd8bb2d4737a8bcb30f74ea5ef9e10.quoted
"[^000]:0\t[^]:"
sh$ cat fuzz/crashers/374f2bf4f9cd8bb2d4737a8bcb30f74ea5ef9e10.output
panic: runtime error: slice bounds out of range [:14] with capacity 13
goroutine 1 [running]:
github.com/yuin/goldmark/text.(*Segment).Value(0xc00026edc0, 0x7f2e89c6c000, 0xd, 0xd, 0x7f2e89c6c009, 0x0, 0x0)
/go/src/github.com/yuin/goldmark/text/segment.go:44 +0x33f
github.com/yuin/goldmark/text.(*reader).Value(0xc0001bd7a0, 0xe, 0xe, 0x0, 0x0, 0xd, 0x3)
/go/src/github.com/yuin/goldmark/text/reader.go:106 +0x62
github.com/yuin/goldmark/extension.(*footnoteBlockParser).Open(0x7e2060, 0x689480, 0xc000125f40, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880, 0x1, 0xc000125f40, 0x8)
/go/src/github.com/yuin/goldmark/extension/footnote.go:55 +0x294
github.com/yuin/goldmark/parser.(*parser).openBlocks(0xc0001d4000, 0x689480, 0xc000125f40, 0x0, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880, 0x2)
/go/src/github.com/yuin/goldmark/parser/parser.go:908 +0x481
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001d4000, 0x688040, 0xc00001ef80, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880)
/go/src/github.com/yuin/goldmark/parser/parser.go:1008 +0x218
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001d4000, 0x687780, 0xc0001bd7a0, 0x0, 0x0, 0x0, 0x30, 0x632200)
/go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc0001c88c0, 0x7f2e89c6c000, 0xd, 0xd, 0x6849e0, 0xc00007d9b0, 0x0, 0x0, 0x0, 0x24f90bed, ...)
/go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7f2e89c6c000, 0xd, 0xd, 0x3)
/go/src/github.com/yuin/goldmark/fuzz/fuzz.go:23 +0x269
go-fuzz-dep.Main(0xc00026ff48, 0x1, 0x1)
go-fuzz-dep/main.go:36 +0x1ad
main.main()
github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
exit status 2
Thank you for the excellent markdown processor. This is really very impressive.
Could you please help me to understand the logic of BlockParser.
For example, I want to implement the behaviour of (nesting) lists/blockquotes with help of markers, i.e
%START%
Content 1
%START%
Content2
%FINISH%
Content 3
%FINISH%
should produce
<START>
Content 1
<START>
Content 2
</FINISH>
Content 3
</FINISH>
But when I write something like this, I am a little bit confused. It interrupts the parsing along with the first parser.Close call. But blockquote works well and there could be multiple parser.Close calls during the nested blockquotes parsing cycle.
Please add Release Notes to your releases!
You may borrow my script/tools that I use for this tools/release.sh
This will help a lot for folks watching releases and not all commits :)
Ref this interface:
// A Markdown interface offers functions to convert Markdown text to
// a desired format.
type Markdown interface {
// Convert interprets a UTF-8 bytes source in Markdown and write rendered
// contents to a writer w.
Convert(source []byte, writer io.Writer, opts ...parser.ParseOption) error
// Parser returns a Parser that will be used for conversion.
Parser() parser.Parser
// SetParser sets a Parser to this object.
SetParser(parser.Parser)
// Parser returns a Renderer that will be used for conversion.
Renderer() renderer.Renderer
// SetRenderer sets a Renderer to this object.
SetRenderer(renderer.Renderer)
}
With the above, I can create a Markdown with a custom parser and renderer (I'm not sure what the setters are for) and then run Convert to do the job.
A big win (ref. your benchmarks) when you have this strict separation between parse and render, is to parse once and render to every format you need. I don't see how that is possible with the current API?
goldmark v1.1.7
with Hugo 0.60.0
As mentioned in gohugoio/hugo/issues/6551 Goldmark seems to not support footnote return links although they are supported by PHP Markdown extra.
It would be great if Goldmark supported them.
Thank you very much in advance.
Markdown:
That's some text with a footnote.[^1]
[^1]: And that's the footnote.
Output:
…
<section class="footnotes" role="doc-endnotes"><hr><ol><li id="fn:1" role="doc-endnote"><p>And that's the footnote.</p></li></ol></section>
Hi @yuin
I'm trying to append class="..." to all img tags and wondering if something like this would make sense to add (of course it's simply a hard-coded example for the "class" attribute only):
This way users wouldn't have to completely rewrite the render function when something as simple as adding a class is needed, and they wouldn't risk breaking their code when the library gets updated.
This is my use case:
case *ast.Image:
if entering {
n.SetAttributeString("class", "img-fluid")
}
Hello
Member of the Hugo team here. Currently testing Goldmark as the new default in Hugo 0.60.0 DEV.
Apparently hard line breaks as specified in CommonMark 0.29 are not rendered by Goldmark for Markdown files with Windows-style line endings.
In a collaborative project that I maintain, files can be edited by other team members on Windows.
Typically we use two spaces for a line break.
But I only managed to render the line break after using dos2unix to convert the line endings from DOS to UNIX, like so: dos2unix some-file.md.
cc: @bep
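As a stopgap, the same conversion that dos2unix performs can be applied in memory before the source is handed to the converter. This is a minimal sketch of that workaround, not a fix in the parser; `normalizeNewlines` is a name chosen for the example.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeNewlines converts CRLF and lone CR line endings to LF.
func normalizeNewlines(src []byte) []byte {
	src = bytes.ReplaceAll(src, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(src, []byte("\r"), []byte("\n"))
}

func main() {
	in := []byte("line one  \r\nline two\r\n")
	fmt.Printf("%q\n", normalizeNewlines(in)) // prints: "line one  \nline two\n"
}
```

The two trailing spaces that mark the hard line break are preserved; only the line terminators change.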
Sample:
*[TAB]A
[space][space][space][space]B
The source below is taken from the README.
package main
import (
"bytes"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/parser"
)
func main() {
md := goldmark.New(
goldmark.WithParserOptions(
parser.WithAttribute(),
),
)
source := []byte(`
## heading {#id .className attrName=attrValue class="class1 class2"}
`)
var buf bytes.Buffer
if err := md.Convert(source, &buf); err != nil {
panic(err)
}
}
Panics:
panic: interface conversion: interface {} is [][]uint8, not []uint8
goroutine 1 [running]:
github.com/yuin/goldmark/renderer/html.(*Renderer).RenderAttributes(0xc00009cae0, 0x12262c0, 0xc000119d80, 0x1227860, 0xc0001b2000)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/html/html.go:513 +0x227
github.com/yuin/goldmark/renderer/html.(*Renderer).renderHeading(0xc00009cae0, 0x12262c0, 0xc000119d80, 0xc000184140, 0x49, 0x49, 0x1227860, 0xc0001b2000, 0x2001, 0x8, ...)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/html/html.go:208 +0x11f
github.com/yuin/goldmark/renderer.(*renderer).Render.func2(0x1227860, 0xc0001b2000, 0x1, 0x0, 0x12262c0, 0xc000119d80)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/renderer.go:167 +0x108
github.com/yuin/goldmark/ast.Walk(0x1227860, 0xc0001b2000, 0xc000175e30, 0x3, 0x0)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/ast/ast.go:433 +0x43
github.com/yuin/goldmark/ast.Walk(0x12273e0, 0xc00011c780, 0xc000175e30, 0xc0001b8000, 0x0)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/ast/ast.go:439 +0x149
github.com/yuin/goldmark/renderer.(*renderer).Render(0xc0001245f0, 0x12243c0, 0xc0000909f0, 0xc000184140, 0x49, 0x49, 0x12273e0, 0xc00011c780, 0xc000124501, 0xc0000909f0)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/renderer.go:162 +0x13c
github.com/yuin/goldmark.(*markdown).Convert(0xc000119a00, 0xc000184140, 0x49, 0x49, 0x12243c0, 0xc0000909f0, 0x0, 0x0, 0x0, 0xc000064058, ...)
/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/markdown.go:117 +0xe3
See https://github.com/bep/markdown-benchmarks
I borrowed your tests and tried to make them as similar as possible + added some more.
Feel free to grab the code if you want.
GoldMark is doing well.
Using this file:
package main
import (
"bytes"
gm "github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
)
func main() {
var s1 = []byte("https://github.com/sunday's")
var s2 bytes.Buffer
gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
print(s2.String())
}
I get this result:
<p><a href="https://github.com/sunday">https://github.com/sunday</a>'s</p>
With github.com parser, I get this result:
<p><a href="https://github.com/sunday's">https://github.com/sunday's</a></p>
Example:
Hi there,
Thanks for spending the time to make this! This is super useful, and the extensibility is a great feature not easily found elsewhere. I had one issue, I'm not sure if this is a bug or a side effect, but here it goes. In fenced code blocks, it appears that the last backtick escapes.
So, for example:
```
function lorem(ipsum, dolor = 1) {
const sit = ipsum == null ? 0 : ipsum.sit;
dolor = sit - amet(dolor);
return sit ? consectetur(ipsum, 0, dolor < 0 ? 0 : dolor) : [];
}
function adipiscing(...elit) {
if (!elit.sit) {
return [];
}
const sed = elit[0];
return eiusmod.tempor(sed) ? sed : [sed];
}
function incididunt(ipsum, ut = 1) {
ut = labore.et(amet(ut), 0);
const sit = ipsum == null ? 0 : ipsum.sit;
if (!sit || ut < 1) {
return [];
}
let dolore = 0;
let magna = 0;
const aliqua = new eiusmod(labore.ut(sit / ut));
while (dolore < sit) {
aliqua[magna++] = consectetur(ipsum, dolore, (dolore += ut));
}
return aliqua;
}
```
Ends up being rendered as:
——————————————————————————————
function lorem(ipsum, dolor = 1) {
const sit = ipsum == null ? 0 : ipsum.sit;
dolor = sit - amet(dolor);
return sit ? consectetur(ipsum, 0, dolor < 0 ? 0 : dolor) : [];
}
function adipiscing(...elit) {
if (!elit.sit) {
return [];
}
const sed = elit[0];
return eiusmod.tempor(sed) ? sed : [sed];
}
function incididunt(ipsum, ut = 1) {
ut = labore.et(amet(ut), 0);
const sit = ipsum == null ? 0 : ipsum.sit;
if (!sit || ut < 1) {
return [];
}
let dolore = 0;
let magna = 0;
const aliqua = new eiusmod(labore.ut(sit / ut));
while (dolore < sit) {
aliqua[magna++] = consectetur(ipsum, dolore, (dolore += ut));
}
return aliqua;
}
`
——————————————————————————————
^ superfluous last backtick
This is the actual code fragment that is generated by above:
<pre style="color:#93a1a1;background-color:#002b36"><span style="color:#268bd2">function</span> lorem(ipsum, dolor <span style="color:#719e07">=</span> <span style="color:#2aa198">1</span>) {
<span style="color:#268bd2">const</span> sit <span style="color:#719e07">=</span> ipsum <span style="color:#719e07">==</span> <span style="color:#cb4b16">null</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> ipsum.sit;
dolor <span style="color:#719e07">=</span> sit <span style="color:#719e07">-</span> amet(dolor);
<span style="color:#719e07">return</span> sit <span style="color:#719e07">?</span> consectetur(ipsum, <span style="color:#2aa198">0</span>, dolor <span style="color:#719e07"><</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> dolor) <span style="color:#719e07">:</span> [];
}
<span style="color:#268bd2">function</span> adipiscing(...elit) {
<span style="color:#719e07">if</span> (<span style="color:#719e07">!</span>elit.sit) {
<span style="color:#719e07">return</span> [];
}
<span style="color:#268bd2">const</span> sed <span style="color:#719e07">=</span> elit[<span style="color:#2aa198">0</span>];
<span style="color:#719e07">return</span> eiusmod.tempor(sed) <span style="color:#719e07">?</span> sed <span style="color:#719e07">:</span> [sed];
}
<span style="color:#268bd2">function</span> incididunt(ipsum, ut <span style="color:#719e07">=</span> <span style="color:#2aa198">1</span>) {
ut <span style="color:#719e07">=</span> labore.et(amet(ut), <span style="color:#2aa198">0</span>);
<span style="color:#268bd2">const</span> sit <span style="color:#719e07">=</span> ipsum <span style="color:#719e07">==</span> <span style="color:#cb4b16">null</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> ipsum.sit;
<span style="color:#719e07">if</span> (<span style="color:#719e07">!</span>sit <span style="color:#719e07">||</span> ut <span style="color:#719e07"><</span> <span style="color:#2aa198">1</span>) {
<span style="color:#719e07">return</span> [];
}
<span style="color:#268bd2">let</span> dolore <span style="color:#719e07">=</span> <span style="color:#2aa198">0</span>;
<span style="color:#268bd2">let</span> magna <span style="color:#719e07">=</span> <span style="color:#2aa198">0</span>;
<span style="color:#268bd2">const</span> aliqua <span style="color:#719e07">=</span> <span style="color:#719e07">new</span> eiusmod(labore.ut(sit <span style="color:#719e07">/</span> ut));
<span style="color:#719e07">while</span> (dolore <span style="color:#719e07"><</span> sit) {
aliqua[magna<span style="color:#719e07">++</span>] <span style="color:#719e07">=</span> consectetur(ipsum, dolore, (dolore <span style="color:#719e07">+=</span> ut));
}
<span style="color:#719e07">return</span> aliqua;
}
</pre><p>`</p>
Thanks for taking a look!
Hello,
For background and related discussion, please see the following post in Hugo forum.
Please answer the following before submitting your issue:
command-gen-instance and v1-0-0-apr-21-2019
commandgeninstance and v100-april-21-2019
Many thanks for your work with Goldmark.
For the following text:
I'm going to see my mother. She's very nice.
Currently, ' is not converted to ’ in contractions when the typography extension is enabled, although SmartyPants does this conversion. I would expect the output to be:
I’m going to see my mother. She’s very nice.
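The expected rule can be approximated as "a straight apostrophe directly between two letters becomes a right single quotation mark". The sketch below implements only that rule as an illustration; `curlContractions` is a hypothetical helper and is not how the typography extension or SmartyPants is implemented.

```go
package main

import (
	"fmt"
	"unicode"
)

// curlContractions replaces a straight apostrophe that sits directly between
// two letters with a right single quotation mark (U+2019).
func curlContractions(s string) string {
	runes := []rune(s)
	for i := 1; i < len(runes)-1; i++ {
		if runes[i] == '\'' && unicode.IsLetter(runes[i-1]) && unicode.IsLetter(runes[i+1]) {
			runes[i] = '\u2019'
		}
	}
	return string(runes)
}

func main() {
	fmt.Println(curlContractions("I'm going to see my mother. She's very nice."))
}
```

Apostrophes at the start or end of a word, such as opening and closing single quotes, are deliberately left untouched by this rule.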
This is a follow up to #37
So, setting state on the nodes in the AST and then using that while rendering works, but ...
What I'm now doing instead is something like:
w := renderContext{
BufWriter: bufio.NewWriter(buf),
renderContextData: renderContextDataHolder{
rctx: ctx,
dctx: c.ctx,
},
}
if err := c.md.Renderer().Render(w, ctx.Src, doc); err != nil {
return nil, err
}
This works great, and I don't mind doing it like this (this is entirely internal), but the downside is that it may stop working in the future if you decide to wrap the writer or something.
func (*test) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
walk(node)
}
func walk(node ast.Node) {
for n := node.FirstChild(); n != nil; n = node.NextSibling() {
walk(n)
}
}
test![^1]
[^1]: footnote
<p>test![^1]</p>
This happens if an exclamation mark is placed before the footnote link.
Using this file:
package main
import (
"bytes"
gm "github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
)
func main() {
var s1 = []byte("https://github.com#sun,mon")
var s2 bytes.Buffer
gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
print(s2.String())
}
I get this result:
<p><a href="https://github.com#sun">https://github.com#sun</a>,mon</p>
With github.com parser, I get this result:
<p><a href="https://github.com#sun,mon">https://github.com#sun,mon</a></p>
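For reference, my reading of the GFM extended autolink rules is that only a fixed set of trailing punctuation (`?`, `!`, `.`, `,`, `:`, `*`, `_`, `~`) is excluded from the link and that `<` ends it, while the same characters are allowed in the interior. The sketch below applies just those two rules to a candidate run of non-space text; `trimAutolink` is a made-up helper, and the parenthesis-balancing and entity rules of the spec are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// trimAutolink returns the part of candidate that would be treated as the
// link: everything before the first '<', minus trailing punctuation.
func trimAutolink(candidate string) string {
	if i := strings.IndexByte(candidate, '<'); i >= 0 {
		candidate = candidate[:i]
	}
	return strings.TrimRight(candidate, "?!.,:*_~")
}

func main() {
	fmt.Println(trimAutolink("https://github.com#sun,mon"))  // prints: https://github.com#sun,mon
	fmt.Println(trimAutolink("http://www.example.com."))     // prints: http://www.example.com
	fmt.Println(trimAutolink("https://github.com/sunday's")) // prints: https://github.com/sunday's
}
```

Under these two rules a comma, a colon, `>`, or an apostrophe in the middle of the URL does not terminate the link.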
Example:
Hey!
I am trying to "markdownify" input coming from an HTML textarea, and it contains carriage returns.
Using a fenced code block with carriage returns causes the whole program to panic with a slice bounds out of range error.
Here is an example:
package main
import (
"bytes"
"fmt"
"html/template"
"github.com/yuin/goldmark"
)
func main() {
var buf bytes.Buffer
if err := goldmark.Convert([]byte("lol\r\n\r\n```\r\nok\r\n```\r\n\r\nyes"), &buf); err != nil {
panic(err)
}
fmt.Printf("%v", template.HTML(buf.String()))
}
panic: runtime error: slice bounds out of range
goroutine 1 [running]:
github.com/yuin/goldmark/parser.(*fencedCodeBlockParser).Open(0x8336e0, 0x6b0340, 0xc000102600, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0, 0x0, 0x0, 0x10)
/home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/fcode_block.go:51 +0x419
github.com/yuin/goldmark/parser.(*parser).openBlocks(0xc000172000, 0x6b0340, 0xc000102600, 0x1, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0, 0x3)
/home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:849 +0x27e
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc000172000, 0x6b0340, 0xc000102600, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0)
/home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:923 +0x1ce
github.com/yuin/goldmark/parser.(*parser).Parse(0xc000172000, 0x6afa80, 0xc00009a7e0, 0x0, 0x0, 0x0, 0x20, 0x62e2c0)
/home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:771 +0x157
github.com/yuin/goldmark.(*markdown).Convert(0xc00017a000, 0xc0000b6a00, 0x1a, 0x1a, 0x6ab8a0, 0xc00007ad20, 0x0, 0x0, 0x0, 0x0, ...)
/home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/markdown.go:116 +0x94
github.com/yuin/goldmark.Convert(...)
/home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/markdown.go:31
main.main()
/home/thomas/Proj/blobstash/cmd/lol/p.go:13 +0xbc
exit status 2
Thanks!
Goldmark's CommonMark compatibility is amazing and, with attribute support, the mathjax extension, and the metadata extension, it covers what I feel are the most popular parts of Pandoc Markdown. It seems very possible that Goldmark could eventually replace external pandoc dependencies in many Go applications today. I would very much like to contribute toward that goal and I'm aware of others who would too.
To that end I am seeking some design direction and consensus about how to move forward.
An extension for each feature seems most reasonable. But I opened this issue to make sure a full AST Transformer would not be a better approach. Personally I prefer the modularity of an extension for each feature, particularly Pandoc's unique Simplified Tables, and I find the compositional design that Pandoc has used for its internals valuable.
Which design direction is most recommended for such work? Several extensions or a single Transformer? I'm almost sure the answer is extensions but am asking anyway to avoid something I may have missed.
Thank you. (If there is a better place to have this discussion please let me know.)
I'm in the process of creating some link/image extensions that would allow for link resolution/image resize etc.
For that to work, I need to pass on some document state to the custom link renderer. But I don't see how.
The Parse method can take a context, but I don't see a similar way to pass a struct via Render. I could create a new goldmark.Markdown for each document, but that sounds wasteful.
In working on adding this to Hugo, I wanted to implement ToC in a general way that we could possibly also use for other things; e.g. a "content map" with byte slice pointers (start/stop) into the rendered content.
I experimented by creating an extension:
https://github.com/bep/hugo/blob/goldmark2/markup/goldmark/contentmap.go#L35
But that doesn't work, as I notice that you pick up the first renderer for a given node kind.
Note that for the ToC thing (which is what Hugo has today), I can traverse the AST and build the ToC from that, but it would be really useful if I could somehow register the rendered start/stop position for the different blocks, so people could do things like:
Again, thanks for this library, it's really easy to use.
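The "content map" idea described above (start/stop byte positions into the rendered output) can be pictured with a writer that counts bytes. This is only a standard-library sketch of the bookkeeping, not goldmark's API: offsetWriter, contentMap, renderBlock, and renderDemo are hypothetical names, and the hard-coded HTML strings stand in for whatever a real renderer would emit.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// offsetWriter wraps an io.Writer and counts the bytes written so far.
type offsetWriter struct {
	w io.Writer
	n int
}

func (o *offsetWriter) Write(p []byte) (int, error) {
	n, err := o.w.Write(p)
	o.n += n
	return n, err
}

// span records where one rendered block starts and stops in the output.
type span struct {
	Kind        string
	Start, Stop int
}

// contentMap (hypothetical) collects spans while blocks are rendered.
type contentMap struct {
	out   *offsetWriter
	Spans []span
}

// renderBlock runs render and records the output range it produced.
func (c *contentMap) renderBlock(kind string, render func(io.Writer) error) error {
	start := c.out.n
	if err := render(c.out); err != nil {
		return err
	}
	c.Spans = append(c.Spans, span{Kind: kind, Start: start, Stop: c.out.n})
	return nil
}

// renderDemo writes two stand-in blocks and returns the output and its map.
func renderDemo() (string, []span, error) {
	var buf bytes.Buffer
	cm := &contentMap{out: &offsetWriter{w: &buf}}
	blocks := []struct{ kind, markup string }{
		{"heading", "<h1>Title</h1>\n"},
		{"paragraph", "<p>Body</p>\n"},
	}
	for _, b := range blocks {
		markup := b.markup
		err := cm.renderBlock(b.kind, func(w io.Writer) error {
			_, err := io.WriteString(w, markup)
			return err
		})
		if err != nil {
			return "", nil, err
		}
	}
	return buf.String(), cm.Spans, nil
}

func main() {
	out, spans, err := renderDemo()
	if err != nil {
		panic(err)
	}
	// Each span slices the rendered output back into its block.
	for _, s := range spans {
		fmt.Printf("%s [%d:%d] %q\n", s.Kind, s.Start, s.Stop, out[s.Start:s.Stop])
	}
}
```

With spans like these, a caller could slice the rendered bytes per block after the fact, which is roughly what the issue asks for; how such a hook would be wired into the renderer is left open here.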
Test case: https://github.com/mironovalexey/gm-test (eof package).
Thank you for the great work on this library! I've been looking for a clean implementation like this that works out of the box without any patches.
Now, for a project I'm working on, I need to hack the default renderer a little. Would you please direct me to where I should plug in my custom rendering of YouTube auto-links? Basically, I have them as *ast.AutoLink nodes. I'm trying to rewrite their rendering so that they appear in <div>s with the <img> of the YouTube preview picture and an <a> leading to the video. That's the idea.
So far I've been able to declare a custom type:
// CustomGoldmarkRenderer renders specific markdown documents containing video links
type CustomGoldmarkRenderer struct {
	defaultRenderer renderer.Renderer
	file            *[]byte
}
which I then make implement the Renderer interface:
func (c CustomGoldmarkRenderer) Render(w io.Writer, source []byte, n ast.Node) error {
	ast.Walk(n, func(n ast.Node, entering bool) (status ast.WalkStatus, err error) {
		switch t := n.(type) {
		case *ast.AutoLink:
			url := string(t.URL(*c.file))
			matches := youTubeLinkRegex.FindAllStringSubmatch(url, -1)
			if len(matches) == 0 {
				// Or try a short link
				matches = youTubeShortLinkRegex.FindAllStringSubmatch(url, -1)
			}
			if len(matches) > 0 {
				videoID := matches[0][1] // Group 1 stands for the first (...) block
				if entering {
					fmt.Fprintf(w, `
	<div class="py-2 col-12 col-xl-3 col-lg-3 col-md-4 mb-2">
		<a href="%s" target="_blank" class="d-block h-180">
			<img class="img-fluid img-thumbnail rounded" src="%s" alt="%s"/>
		</a>
		%s`,
						url,
						"https://img.youtube.com/vi/"+videoID+"/mqdefault.jpg",
						"title",
						"titleHTML")
					return ast.WalkSkipChildren, nil
				} else {
					fmt.Fprintf(w, `</div>`)
				}
			}
		}
		return ast.WalkContinue, nil
	})
	return c.defaultRenderer.Render(w, source, n)
}
so that I'm able to plug my custom renderer into the md := goldmark.New(...) as follows:
md.SetRenderer(CustomGoldmarkRenderer{
	defaultRenderer: md.Renderer(),
	file:            &file, // Passing original source so that it becomes available in parsing function
})
var buf bytes.Buffer
if err := md.Convert(file, &buf); err != nil {
	panic(err)
}
Now of course what I get is simply the additional <div>s that I render with my walker, and then (unsurprisingly) the ordinary rendering is appended to the io.Writer when return c.defaultRenderer.Render(w, source, n) comes into play.
Being an amateur Go coder, I can't figure out how to render the rest of the nodes the default way while I do my own rendering in the custom ast.Walk(...) call, node by node, since return c.defaultRenderer.Render(w, source, n) seems to be called just once for the Document node and doesn't really help me with individual nodes at all.
So, would you be so kind as to hint where I went wrong and which direction I should take instead?
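Whichever hook ends up doing the rendering (goldmark's renderer package supports registering per-node-kind render functions, though that wiring is not shown here), the YouTube-specific part can be kept as a pure function of the URL. The following is a standard-library sketch only: the two regular expressions are assumptions, since the post does not show youTubeLinkRegex or youTubeShortLinkRegex, and videoID and previewHTML are hypothetical helper names with simplified markup.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
)

// Assumed patterns; the original post does not include its regexes.
var (
	youTubeLinkRegex      = regexp.MustCompile(`^https?://(?:www\.)?youtube\.com/watch\?v=([\w-]{11})`)
	youTubeShortLinkRegex = regexp.MustCompile(`^https?://youtu\.be/([\w-]{11})`)
)

// videoID returns the video ID for a full or short YouTube link.
func videoID(url string) (string, bool) {
	for _, re := range []*regexp.Regexp{youTubeLinkRegex, youTubeShortLinkRegex} {
		if m := re.FindStringSubmatch(url); m != nil {
			return m[1], true // group 1 is the ID
		}
	}
	return "", false
}

// previewHTML builds a thumbnail link for a YouTube URL. It reports false
// for any other URL so the caller can fall back to default rendering.
func previewHTML(url string) (string, bool) {
	id, ok := videoID(url)
	if !ok {
		return "", false
	}
	thumb := "https://img.youtube.com/vi/" + id + "/mqdefault.jpg"
	return fmt.Sprintf(
		`<div class="video"><a href="%s" target="_blank"><img src="%s" alt="video preview"/></a></div>`,
		html.EscapeString(url), thumb), true
}

func main() {
	for _, u := range []string{
		"https://www.youtube.com/watch?v=abcDEF12345",
		"https://youtu.be/abcDEF12345",
		"http://example.com",
	} {
		if out, ok := previewHTML(u); ok {
			fmt.Println(out)
		} else {
			fmt.Println("not a video link:", u)
		}
	}
}
```

Keeping this logic separate means the node-level function only has to decide between "write previewHTML's result" and "render the link as usual", which avoids rendering the whole document twice.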
After using goldmark to process the Markdown text **「刻舟求剑」**, the result is still **「刻舟求剑」**, but the expected result is 「刻舟求剑」 rendered in bold.
https://github.com/mironovalexey/gm-test/tree/master/hcomments
https://github.com/mironovalexey/gm-test/blob/master/hcomments/test.md
Any line between --- and --- breaks the processing.
Other parsers allow for bare autolinks. For example:
http://example.com
returns:
<a href="http://example.com">http://example.com</a>
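To make the requested behaviour concrete, here is a small standard-library sketch that turns bare http/https URLs in plain text into anchors. It is not how goldmark or any other parser implements autolinks: the bareURL pattern and the linkify helper are illustrative assumptions, trailing sentence punctuation is handled crudely, and the surrounding text is assumed to need no escaping.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// bareURL is a deliberately simple pattern (an assumption for this sketch):
// a scheme followed by anything up to whitespace or an angle bracket.
var bareURL = regexp.MustCompile(`https?://[^\s<>]+`)

// linkify wraps each bare URL in an <a> element.
func linkify(text string) string {
	return bareURL.ReplaceAllStringFunc(text, func(u string) string {
		// Keep trailing sentence punctuation outside the link.
		trimmed := strings.TrimRight(u, ".,;:!?")
		rest := u[len(trimmed):]
		e := html.EscapeString(trimmed)
		return `<a href="` + e + `">` + e + `</a>` + rest
	})
}

func main() {
	fmt.Println(linkify("http://example.com"))
	fmt.Println(linkify("See http://example.com."))
}
```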
When a span-level element contains newlines, its content is not treated as Markdown.
Test case: https://github.com/mironovalexey/gm-test (html package).
https://github.com/mironovalexey/gm-test/blob/master/html/test.md