yuin / goldmark

:trophy: A markdown parser written in Go. Easy to extend, standard(CommonMark) compliant, well structured.

License: MIT License

Makefile 0.28% Go 99.43% C 0.29%

commonmark go golang markdown

goldmark's Issues

Footnotes numbering should always be sequential

Running Hugo 0.60.1 with Goldmark 1.1.8. I'm basing this off the PHP Markdown Extra spec since CommonMark doesn't support footnotes, so please bear with me.

From my reading of the PHP Markdown Extra spec for footnotes, they should always be parsed and numbered in sequential order in the document. What I've noticed in Goldmark is that they're numbered using the text in the footnote name, or that this sequential renumbering step is skipped - I'm not entirely sure which.

What did you do? : Added footnotes to text, e.g.:

This[^3] is[^1] text with footnotes[^2].

[^1]: Footnote one
[^2]: Footnote two
[^3]: Footnote three

What did you expect to see? :

This¹ is² text with footnotes³
¹: Footnote three
²: Footnote one
³: Footnote two

What did you see instead? :

This³ is¹ text with footnotes²
¹: Footnote one
²: Footnote two
³: Footnote three

(I apologize, my first example did not reflect the actual problem. I have updated it.)
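
The expected numbering can be sketched as a small pass that assigns each footnote label a number in order of its first reference in the body text. This is a stand-alone illustration using only the standard library; `sequentialNumbers` and `refPattern` are hypothetical names and are not part of goldmark.

```go
package main

import (
	"fmt"
	"regexp"
)

// refPattern matches inline footnote references such as [^3].
// The caller is assumed to pass body text only, without the
// "[^3]: ..." definition lines.
var refPattern = regexp.MustCompile(`\[\^([^\]]+)\]`)

// sequentialNumbers maps each footnote label to a 1-based number in
// order of first appearance in body.
func sequentialNumbers(body string) map[string]int {
	numbers := make(map[string]int)
	for _, m := range refPattern.FindAllStringSubmatch(body, -1) {
		label := m[1]
		if _, seen := numbers[label]; !seen {
			numbers[label] = len(numbers) + 1
		}
	}
	return numbers
}

func main() {
	n := sequentialNumbers("This[^3] is[^1] text with footnotes[^2].")
	fmt.Println(n["3"], n["1"], n["2"]) // 1 2 3
}
```

With the example from this issue, label 3 becomes footnote 1, label 1 becomes footnote 2, and label 2 becomes footnote 3, which matches the expected output above.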

Rendering of external links in safe mode

I've now merged in Goldmark as the default Markdown handler in Hugo and it works great.

I have set unsafe=false as the default, and that works mostly as expected.

But the rendering of external links comes as a surprise to most people, I think.

[Google Search!](https://google.com/)

[Google Search!](https://google.com/)

So, the security motivation behind the above is maybe to prevent fake linking? But when the end result is that most people configure it to be unsafe just to get proper links, I think the net effect on security is negative.

gohugoio/hugoThemesSite#67

Remove the 1.12.x tests or fix the library to conform to unsigned shifts change since 1.13

According to https://github.com/yuin/goldmark/blob/master/.github/workflows/test.yaml, the GitHub Actions workflow runs the tests against Go 1.12.x. That only makes sense if the library is compatible with 1.12, which, going by go.mod, it is not.

On the other hand, the only thing preventing compatibility with 1.12 is the use of signed shift counts, which Go only allows since 1.13. Would it be worth converting the loops that cause trouble to use uint instead?
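
For illustration, the kind of change being proposed looks like the following. This is a generic sketch, not code taken from goldmark; `bitMask` is a hypothetical function.

```go
package main

import "fmt"

// bitMask returns a value with only bit i set. The explicit uint
// conversion keeps the shift count unsigned, which Go 1.12 and earlier
// require. From Go 1.13 on, `1 << i` with a signed i also compiles.
func bitMask(i int) uint32 {
	return 1 << uint(i)
}

func main() {
	for i := 0; i < 4; i++ {
		fmt.Println(bitMask(i)) // prints 1, 2, 4, 8 on separate lines
	}
}
```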

Single line is treated as a paragraph

Test:
https://github.com/mironovalexey/gm-test/blob/master/line/run.go

Result:

<p>Single <code>line</code></p>

"!" will always start a new text element.

Given:

This is a line! Yes.

And this is another!

Got:

    Paragraph {
        RawText: "This is a line! Yes."
        HasBlankPreviousLines: false
        Text: "This is a line"
        Text: "! Yes."
    }
    Paragraph {
        RawText: "And this is another!"
        HasBlankPreviousLines: false
        Text: "And this is another"
        Text: "!"
    }

Expected:

    Paragraph {
        RawText: "This is a line! Yes."
        HasBlankPreviousLines: false
        Text: "This is a line! Yes."
    }
    Paragraph {
        RawText: "And this is another!"
        HasBlankPreviousLines: false
        Text: "And this is another!"
    }

Fuzz crash on ">*\t>\n> \t0\n>\t\t0\n>0"

Please answer the following before submitting your issue:

What version of goldmark are you using? : v1.1.9
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : linux/amd64
What did you do? : make fuzz
What did you expect to see? : boredom
What did you see instead? :

sh$ cat fuzz/crashers/db13717bee8cb87337140ed44b4f9bc01214e3fb.quoted
        ">*\t>\n> \t0\n>\t\t0\n>0"

sh$ cat fuzz/crashers/db13717bee8cb87337140ed44b4f9bc01214e3fb.output
panic: interface conversion: ast.Node is *ast.CodeBlock, not *ast.ListItem

goroutine 1 [running]:
github.com/yuin/goldmark/parser.lastOffset(0x688820, 0xc000190630, 0x1)
        /go/src/github.com/yuin/goldmark/parser/list.go:102 +0xfd
github.com/yuin/goldmark/parser.(*listParser).Continue(0x7e2060, 0x688820, 0xc000190630, 0x687780, 0xc0001e97a0, 0x687840, 0xc0001e9880, 0xa)
        /go/src/github.com/yuin/goldmark/parser/list.go:192 +0x24e
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001f2000, 0x688040, 0xc00001ec00, 0x687780, 0xc0001e97a0, 0x687840, 0xc0001e9880)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1032 +0x558
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001f2000, 0x687780, 0xc0001e97a0, 0x0, 0x0, 0x0, 0x30, 0x632200)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc000078b00, 0x7ff83a71d000, 0x11, 0x11, 0x6849e0, 0xc0000959b0, 0x0, 0x0, 0x0, 0x3a2334ec, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7ff83a71d000, 0x11, 0x11, 0x3)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:23 +0x269
go-fuzz-dep.Main(0xc00029bf48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
exit status 2

How to apply microtypographic rules to Markdown?

What version of goldmark are you using? : v1.11.1 (Hugo 0.60.1)
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : darwin/amd64 (macOS 10.15.1)
What did you do? : Write Markdown in French
What did you expect to see? : French typographic rules applied (like inserting a non-breaking space before a question mark)
What did you see instead? : no French typographic rules
(Feature request only): Why you can not implement it as an extension?: Not a Go programmer

How should French typographic rules be applied: through an extension, or is it something that depends on the Go language itself? Or on another Go package? Something like SmartyPants, but with more rules specific to a language.

For instance, languages like PHP have libs to handle this https://github.com/jolicode/JoliTypo

Some of the French typographic rules are listed by the Grammalecte Firefox extension:

Extended unicode characters discarded from auto heading IDs

The Goldmark 1.1.8 implementation only takes one-byte code points (ASCII) into account when generating auto heading IDs, simply discarding extended Latin characters (2 bytes) and other international characters (3 bytes).

https://github.com/yuin/goldmark/blob/master/parser/parser.go#L83-L85

In multilingual sites, this causes imperfect heading IDs to be generated.
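
A Unicode-aware ID generator could look roughly like this. It is a minimal sketch of the requested behaviour, not goldmark's implementation; `headingID` is a hypothetical helper, and the exact rules (lowercasing, hyphen for spaces, dropping other characters) are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// headingID keeps letters and digits from any script (lowercased),
// turns each run of whitespace between kept characters into a single
// hyphen, and drops everything else.
func headingID(heading string) string {
	var b strings.Builder
	pendingHyphen := false
	for _, r := range heading {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			if pendingHyphen && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingHyphen = false
			b.WriteRune(unicode.ToLower(r))
		case unicode.IsSpace(r):
			pendingHyphen = true
		}
	}
	return b.String()
}

func main() {
	fmt.Println(headingID("Übersicht der Änderungen")) // übersicht-der-änderungen
	fmt.Println(headingID("日本語 の 見出し"))
}
```

Iterating with `range` over the string decodes whole runes, so multi-byte characters are classified as letters instead of being discarded byte by byte.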

table extension: Merged table columns

Tested with Goldmark 1.16.

package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/extension"
)

func main() {

	convert(`Foo|Bar
---|---
` + "`" + `Yoyo` + "`" + `|Dyne`)
}

func convert(src string) {

	markdown := goldmark.New(
		goldmark.WithExtensions(
			extension.Table,
		),
	)
	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Produces

<table>
<thead>
<tr>
<th>Foo</th>
<th>Bar</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>Yoyo</code>|Dyne</td>
<td></td>
</tr>
</tbody>
</table>

The "try it" on https://commonmark.org/help/tutorial/02-emphasis.html renders it correctly; https://spec.commonmark.org/dingus/ renders nothing.

gohugoio/hugo#6641

How to remove all nodes with NodeType in ASTTransformer?

func (*testTransformer) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
    processNodes(node)
}

func processNodes(n ast.Node) {
    if n.Kind() == ast.KindHeading {
        if p := n.Parent(); p != nil {
            p.RemoveChild(p, n)
        }
        return
    }
    for c := n.FirstChild(); c != nil; c = n.NextSibling() {
        processNodes(c)
    }
}

Source markdown:

# Header 1

text

## Header 2

text

Result:

<p>text</p>
<h2>Header 2</h2>
<p>text</p>
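
One likely cause is visible in the snippet itself: the loop advances with `n.NextSibling()`, the sibling of the parent, rather than the sibling of the current child `c`, so for the document node it stops after the first child. A second pitfall is that removing a node may clear its sibling links, so the next sibling should be captured before the removal. The pattern is sketched below on a stand-in tree type; `node`, `appendChild`, `removeChild`, `removeKind` and `childKinds` are hypothetical and are not goldmark's `ast` API.

```go
package main

import "fmt"

// node is a stand-in for an AST node with parent and sibling links.
type node struct {
	kind   string
	parent *node
	first  *node
	last   *node
	next   *node
	prev   *node
}

// appendChild links c as the last child of n.
func (n *node) appendChild(c *node) {
	c.parent = n
	c.prev = n.last
	if n.last != nil {
		n.last.next = c
	} else {
		n.first = c
	}
	n.last = c
}

// removeChild unlinks c from n and clears c's own links.
func (n *node) removeChild(c *node) {
	if c.prev != nil {
		c.prev.next = c.next
	} else {
		n.first = c.next
	}
	if c.next != nil {
		c.next.prev = c.prev
	} else {
		n.last = c.prev
	}
	c.parent, c.next, c.prev = nil, nil, nil
}

// removeKind deletes every descendant of n whose kind matches.
func removeKind(n *node, kind string) {
	for c := n.first; c != nil; {
		next := c.next // remember the sibling before c may be unlinked
		if c.kind == kind {
			n.removeChild(c)
		} else {
			removeKind(c, kind)
		}
		c = next
	}
}

// childKinds lists the kinds of n's direct children.
func childKinds(n *node) []string {
	var out []string
	for c := n.first; c != nil; c = c.next {
		out = append(out, c.kind)
	}
	return out
}

func main() {
	doc := &node{kind: "document"}
	for _, k := range []string{"heading", "paragraph", "heading", "paragraph"} {
		doc.appendChild(&node{kind: k})
	}
	removeKind(doc, "heading")
	fmt.Println(childKinds(doc)) // [paragraph paragraph]
}
```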

Panic in auto-id

package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/debug"

	"github.com/yuin/goldmark/parser"

	"github.com/yuin/goldmark"
)

func main() {

	convert(`#
# FOO`)
}

func convert(src string) {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("Panic:\n", string(debug.Stack()))
		}
	}()

	markdown := goldmark.New(
		goldmark.WithParserOptions(
			parser.WithAutoHeadingID(),
		),
	)
	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}
}

github.com/yuin/goldmark/text.(*Segments).At(...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/text/segment.go:182
github.com/yuin/goldmark/parser.generateAutoHeadingID(0xc0001ba000, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/atx_heading.go:190 +0x219
github.com/yuin/goldmark/parser.(*atxHeadingParser).Close(0xc00012120e, 0x122a0c0, 0xc0001ba000, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/atx_heading.go:173 +0xba
github.com/yuin/goldmark/parser.(*parser).closeBlocks(0xc0001b1500, 0x0, 0x0, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:845 +0x162
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001b1500, 0x1229c40, 0xc000126780, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:1058 +0x753
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001b1500, 0x1229380, 0xc0001aa7e0, 0x0, 0x0, 0x0, 0x8, 0x8)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:818 +0x148
github.com/yuin/goldmark.(*markdown).Convert(0xc000123a40, 0xc000121230, 0x7, 0x8, 0x1226bc0, 0xc00009a9f0, 0x0, 0x0, 0x0, 0xc0001ae000, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/markdown.go:116 +0x94
main.convert(0x11f960a, 0x7)
	/Users/bep/dev/go/bep/temp/main.go:33 +0x1d5
main.main()
	/Users/bep/dev/go/bep/temp/main.go:16 +0x36

Colon inside ** breaks "boldness"

package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/yuin/goldmark"
)

func main() {
	content := `**Bold:**Regular`

	markdown := goldmark.New()

	var buf bytes.Buffer
	err := markdown.Convert([]byte(content), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Prints:

<p>**Bold:**Regular</p>

Header attributes

It seems that parser.WithAttribute() processing does not always work well.

https://github.com/mironovalexey/gm-test/tree/master/hattrs

https://github.com/mironovalexey/gm-test/blob/master/hattrs/test.md

Greater-than sign breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {   
   var s1 = []byte("https://github.com?q=stars:>1")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com?q=stars:">https://github.com?q=stars:</a>&gt;1</p>

With the github.com parser, I get this result:

<p><a href="https://github.com?q=stars:%3E1">https://github.com?q=stars:&gt;1</a></p>

Example:

https://github.com?q=stars:>1

Support for inline footnotes

goldmark is fully compliant with the CommonMark. Before submitting issue, you must read CommonMark spec and confirm your output is different from CommonMark online demo.

Please answer the following before submitting your issue:

What version of goldmark are you using? : v1.1.10 via Hugo v0.60.1
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : darwin/amd64

This is a feature request. Both Pandoc and Blackfriday support a different version of footnotes than the one currently supported by Goldmark. Pandoc refers to them as inline footnotes. The syntax looks like this:

This is a sentence.^[This is footnote one.] This is also a sentence.^[This will become footnote two.]

Would it be possible for Goldmark to support this kind of footnote?
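
Until such support exists, one possible workaround is to rewrite inline footnotes into reference-style footnotes before the source reaches the parser. The sketch below is an assumption about how that could be done, not an existing goldmark feature; `expandInlineFootnotes` and the generated `inlineN` labels are hypothetical, and the regular expression does not handle nested brackets or footnote syntax inside code spans.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// inlineNote matches Pandoc-style inline footnotes: ^[footnote text].
var inlineNote = regexp.MustCompile(`\^\[([^\]]+)\]`)

// expandInlineFootnotes replaces each inline footnote with a generated
// reference label and appends the matching definitions to the source.
func expandInlineFootnotes(src string) string {
	var defs []string
	out := inlineNote.ReplaceAllStringFunc(src, func(m string) string {
		defs = append(defs, inlineNote.FindStringSubmatch(m)[1])
		return fmt.Sprintf("[^inline%d]", len(defs))
	})
	if len(defs) == 0 {
		return src
	}
	var b strings.Builder
	b.WriteString(out)
	b.WriteString("\n")
	for i, d := range defs {
		fmt.Fprintf(&b, "\n[^inline%d]: %s", i+1, d)
	}
	return b.String()
}

func main() {
	src := "This is a sentence.^[This is footnote one.] This is also a sentence.^[This will become footnote two.]"
	fmt.Println(expandInlineFootnotes(src))
}
```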

slice bounds out of range on Windows

After 4536e57
This file crashes: https://github.com/gohugoio/hugo/blob/master/hugolib/testdata/what-is-markdown.md

Linkify does not work after Chinese characters

goldmark does not linkify the following links:

搜索引擎链接https://www.google.com

搜索引擎链接：https://www.google.com

What version of goldmark are you using? : v1.1.11
What version of Go are you using? : 1.12
What operating system and processor architecture are you using? : Hugo v0.60.1 on macOS
What did you do? : Put a link after Chinese characters
What did you expect to see? : The link should be automatically created
What did you see instead? : The link was not automatically created
(Feature request only): Why you can not implement it as an extension?: not applicable

Unwanted paragraph closing tag in html template tag

First of all, thanks a lot for the work on goldmark. I just tried it with the new release and it works great. Though, there is one minor imperfection:

The unsafe option is turned on and there is html code inside a paragraph, like this:

This is **Bold** <span>Component</span><template>
<div>Name</div>
</template>  **Bold** as well.

This will render as:

<p>This is <strong>Bold</strong> <span>Component</span><template></p>
<div>Name</div>
</template>  **Bold** as well.

Notice how the closing tag </p> is set too early. If I remove the line break after <template>, it works as expected:

This is **Bold** <span>Component</span><template> <div>Name</div>
</template>  **Bold** as well.

<p>This is <strong>Bold</strong> <span>Component</span><template> <div>Name</div>
</template>  <strong>Bold</strong> as well.</p>

Of course I can just move the div up, but there are other divs in my template as well (they also call </p> too early) and therefore this one line will become quite long and hard to maintain. Basically, the template tag and everything inside should not call for the automatic setting of </p>.

Even though I use Hugo to render, I think this is a goldmark related issue.

Fuzz crasher in parser/attribute.go:102

Please answer the following before submitting your issue:

What version of goldmark are you using? : v1.1.10
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : linux/amd64
What did you do? : Merge #54 and then make fuzz
What did you expect to see? : boredom
What did you see instead? :

sh$ cat fuzz/crashers/db0b78ba444c6efd83f1d4f6f74faab82aaf3cb5.quoted
        "{\n-"

sh$ cat fuzz/crashers/db0b78ba444c6efd83f1d4f6f74faab82aaf3cb5.output
panic: runtime error: index out of range [0] with length 0

goroutine 1 [running]:
github.com/yuin/goldmark/parser.parseAttribute(0x688e40, 0xc0001bd650, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f33d7475000)
        /go/src/github.com/yuin/goldmark/parser/attribute.go:102 +0xa9b
github.com/yuin/goldmark/parser.ParseAttributes(0x688e40, 0xc0001bd650, 0x0, 0x1, 0x0, 0x1)
        /go/src/github.com/yuin/goldmark/parser/attribute.go:61 +0x1e2
github.com/yuin/goldmark/parser.parseLastLineAttributes(0x689b80, 0xc00016a090, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/atx_heading.go:229 +0x429
github.com/yuin/goldmark/parser.(*setextHeadingParser).Close(0xc000117c70, 0x689b80, 0xc00016a090, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/setext_headings.go:107 +0x56e
github.com/yuin/goldmark/parser.(*parser).closeBlocks(0xc000197500, 0x0, 0x0, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/parser.go:845 +0x199
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc000197500, 0x689700, 0xc00001e980, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1023 +0xc12
github.com/yuin/goldmark/parser.(*parser).Parse(0xc000197500, 0x688e40, 0xc0001bd500, 0x0, 0x0, 0x0, 0x30, 0x633700)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc000119e80, 0x7f33d7475000, 0x3, 0x3, 0x685fa0, 0xc00007cff0, 0x0, 0x0, 0x0, 0x9, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7f33d7475000, 0x3, 0x3, 0x4)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:34 +0x43c
go-fuzz-dep.Main(0xc00026ff48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52

Linkify bug

package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/debug"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/extension"
)

func main() {

	convert(`
Go to [http://www.example.com](www.example.com) or http://www.example.com.
`)
}

func convert(src string) {
	markdown := goldmark.New(
		goldmark.WithExtensions(extension.Linkify),
	)

	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Prints:

<p>Go to <a href="www.example.com"><a href="http://www.example.com">http://www.example.com</a></a> or <a href="http://www.example.com">http://www.example.com</a>.</p>

Fuzz crash on "[^000]:0\t[^]:"

Please answer the following before submitting your issue:

What version of goldmark are you using? : v1.1.9
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : linux/amd64
What did you do? : make fuzz
What did you expect to see? : boredom
What did you see instead? :

sh$ cat fuzz/crashers/374f2bf4f9cd8bb2d4737a8bcb30f74ea5ef9e10.quoted
        "[^000]:0\t[^]:"

sh$ cat fuzz/crashers/374f2bf4f9cd8bb2d4737a8bcb30f74ea5ef9e10.output
panic: runtime error: slice bounds out of range [:14] with capacity 13

goroutine 1 [running]:
github.com/yuin/goldmark/text.(*Segment).Value(0xc00026edc0, 0x7f2e89c6c000, 0xd, 0xd, 0x7f2e89c6c009, 0x0, 0x0)
        /go/src/github.com/yuin/goldmark/text/segment.go:44 +0x33f
github.com/yuin/goldmark/text.(*reader).Value(0xc0001bd7a0, 0xe, 0xe, 0x0, 0x0, 0xd, 0x3)
        /go/src/github.com/yuin/goldmark/text/reader.go:106 +0x62
github.com/yuin/goldmark/extension.(*footnoteBlockParser).Open(0x7e2060, 0x689480, 0xc000125f40, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880, 0x1, 0xc000125f40, 0x8)
        /go/src/github.com/yuin/goldmark/extension/footnote.go:55 +0x294
github.com/yuin/goldmark/parser.(*parser).openBlocks(0xc0001d4000, 0x689480, 0xc000125f40, 0x0, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880, 0x2)
        /go/src/github.com/yuin/goldmark/parser/parser.go:908 +0x481
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001d4000, 0x688040, 0xc00001ef80, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1008 +0x218
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001d4000, 0x687780, 0xc0001bd7a0, 0x0, 0x0, 0x0, 0x30, 0x632200)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc0001c88c0, 0x7f2e89c6c000, 0xd, 0xd, 0x6849e0, 0xc00007d9b0, 0x0, 0x0, 0x0, 0x24f90bed, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7f2e89c6c000, 0xd, 0xd, 0x3)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:23 +0x269
go-fuzz-dep.Main(0xc00026ff48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
exit status 2

BlockParser logic

Thank you for the excellent markdown processor. This is really very impressive.

Could you please help me understand the logic of BlockParser?

For example, I want to implement (nesting) list/blockquote-like behaviour with the help of markers, i.e.

%START%

Content 1

%START%

Content 2

%FINISH%

Content 3

%FINISH%

should produce

<START>
Content 1
<START>
Content 2
</FINISH>
Content 3
</FINISH>

But when I write something like this, I am a little bit confused. It interrupts the parsing along with the first parser.Close call. But blockquote works well and there could be multiple parser.Close calls during the nested blockquotes parsing cycle.

Provide KaTeX support

@KaTeX

Release Notes

Please add Release Notes to your releases!

You may borrow the script I use for this: tools/release.sh

This will help a lot for folks watching releases and not all commits :)

API question

Ref this interface:

// A Markdown interface offers functions to convert Markdown text to
// a desired format.
type Markdown interface {
	// Convert interprets a UTF-8 bytes source in Markdown and write rendered
	// contents to a writer w.
	Convert(source []byte, writer io.Writer, opts ...parser.ParseOption) error

	// Parser returns a Parser that will be used for conversion.
	Parser() parser.Parser

	// SetParser sets a Parser to this object.
	SetParser(parser.Parser)

	// Renderer returns a Renderer that will be used for conversion.
	Renderer() renderer.Renderer

	// SetRenderer sets a Renderer to this object.
	SetRenderer(renderer.Renderer)
}

With the above, I can create a Markdown with a custom parser and renderer (I'm not sure what the setters are for) and then run Convert to do the job.

A big win (ref. your benchmarks) when you have this strict separation between parse and render, is to parse once and render to every format you need. I don't see how that is possible with the current API?

Support footnote return links

goldmark v1.1.7
with Hugo 0.60.0

As mentioned in gohugoio/hugo/issues/6551, Goldmark does not seem to support footnote return links, although they are supported by PHP Markdown Extra.

It would be great if Goldmark supported them.

Thank you very much in advance.

Markdown:

That's some text with a footnote.[^1]

[^1]: And that's the footnote.

Output:

…
<section class="footnotes" role="doc-endnotes"><hr><ol><li id="fn:1" role="doc-endnote"><p>And that's the footnote.</p></li></ol></section>

Rendering "class" attribute

Hi @yuin
I'm trying to append class="..." to all img tags and am wondering whether something like this would make sense to add (of course, it's simply a hard-coded example for the "class" attribute only):

zzwx-forks@11441f5

This way users wouldn't have to completely rewrite the render function when something as simple as adding a class is needed, and they wouldn't risk breaking their code when the library gets updated.

This is my use case:

case *ast.Image:
  if entering {
    n.SetAttributeString("class", "img-fluid")
  }

Hard line breaks not rendered in files with Windows-style line endings

Hello

Member of the Hugo team here. Currently testing Goldmark as the new default in Hugo 0.60.0 DEV.

Apparently hard line breaks as specified in CommonMark 0.29 are not rendered by Goldmark for Markdown files with Windows-style line endings.

In a collaborative project that I maintain, files can be edited by other team members on Windows.
Typically we use two spaces for a line break.

But I only managed to render the line break after using dos2unix to convert the line endings from DOS to UNIX like so: dos2unix some-file.md.

cc: @bep
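
The dos2unix workaround can also be applied in Go before the source is handed to the converter. This is a minimal sketch of that workaround, not a fix inside goldmark; `normalizeNewlines` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeNewlines converts Windows-style CRLF line endings to LF so
// that trailing double spaces sit directly before the newline.
func normalizeNewlines(src []byte) []byte {
	return bytes.ReplaceAll(src, []byte("\r\n"), []byte("\n"))
}

func main() {
	src := []byte("first line  \r\nsecond line\r\n")
	fmt.Printf("%q\n", normalizeNewlines(src)) // "first line  \nsecond line\n"
}
```

The normalized bytes would then be passed to the Markdown converter in place of the original file contents.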

Specially crafted list item may cause goldmark to loop infinitely

Sample:

*[TAB]A
[space][space][space][space]B

Heading attribute panics

The source below is taken from the README.

package main

import (
	"bytes"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/parser"
)

func main() {
	md := goldmark.New(
		goldmark.WithParserOptions(
			parser.WithAttribute(),
		),
	)
	source := []byte(`
## heading {#id .className attrName=attrValue class="class1 class2"}
`)
	var buf bytes.Buffer
	if err := md.Convert(source, &buf); err != nil {
		panic(err)
	}
}

Panics:

panic: interface conversion: interface {} is [][]uint8, not []uint8

goroutine 1 [running]:
github.com/yuin/goldmark/renderer/html.(*Renderer).RenderAttributes(0xc00009cae0, 0x12262c0, 0xc000119d80, 0x1227860, 0xc0001b2000)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/html/html.go:513 +0x227
github.com/yuin/goldmark/renderer/html.(*Renderer).renderHeading(0xc00009cae0, 0x12262c0, 0xc000119d80, 0xc000184140, 0x49, 0x49, 0x1227860, 0xc0001b2000, 0x2001, 0x8, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/html/html.go:208 +0x11f
github.com/yuin/goldmark/renderer.(*renderer).Render.func2(0x1227860, 0xc0001b2000, 0x1, 0x0, 0x12262c0, 0xc000119d80)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/renderer.go:167 +0x108
github.com/yuin/goldmark/ast.Walk(0x1227860, 0xc0001b2000, 0xc000175e30, 0x3, 0x0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/ast/ast.go:433 +0x43
github.com/yuin/goldmark/ast.Walk(0x12273e0, 0xc00011c780, 0xc000175e30, 0xc0001b8000, 0x0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/ast/ast.go:439 +0x149
github.com/yuin/goldmark/renderer.(*renderer).Render(0xc0001245f0, 0x12243c0, 0xc0000909f0, 0xc000184140, 0x49, 0x49, 0x12273e0, 0xc00011c780, 0xc000124501, 0xc0000909f0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/renderer.go:162 +0x13c
github.com/yuin/goldmark.(*markdown).Convert(0xc000119a00, 0xc000184140, 0x49, 0x49, 0x12243c0, 0xc0000909f0, 0x0, 0x0, 0x0, 0xc000064058, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/markdown.go:117 +0xe3

Benchmarks

See https://github.com/bep/markdown-benchmarks

I borrowed your tests and tried to make them as similar as possible + added some more.

Feel free to grab the code if you want.

GoldMark is doing well.

Apostrophe breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {
   var s1 = []byte("https://github.com/sunday's")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com/sunday">https://github.com/sunday</a>'s</p>

With the github.com parser, I get this result:

<p><a href="https://github.com/sunday's">https://github.com/sunday's</a></p>

Example:

https://github.com/sunday's

Last backtick appears to escape in fenced code blocks

Hi there,

Thanks for spending the time to make this! This is super useful, and the extensibility is a great feature not easily found elsewhere. I had one issue, I'm not sure if this is a bug or a side effect, but here it goes. In fenced code blocks, it appears that the last backtick escapes.

So, for example:

    ```
    function lorem(ipsum, dolor = 1) {
      const sit = ipsum == null ? 0 : ipsum.sit;
      dolor = sit - amet(dolor);
      return sit ? consectetur(ipsum, 0, dolor < 0 ? 0 : dolor) : [];
    }

    function adipiscing(...elit) {
      if (!elit.sit) {
        return [];
      }
    
      const sed = elit[0];
      return eiusmod.tempor(sed) ? sed : [sed];
    }

    function incididunt(ipsum, ut = 1) {
      ut = labore.et(amet(ut), 0);
      const sit = ipsum == null ? 0 : ipsum.sit;

      if (!sit || ut < 1) {
        return [];
      }

      let dolore = 0;
      let magna = 0;
      const aliqua = new eiusmod(labore.ut(sit / ut));

      while (dolore < sit) {
        aliqua[magna++] = consectetur(ipsum, dolore, (dolore += ut));
      }
    
      return aliqua;
    }
    ```

Ends up being rendered as:
——————————————————————————————

function lorem(ipsum, dolor = 1) {
  const sit = ipsum == null ? 0 : ipsum.sit;
  dolor = sit - amet(dolor);
  return sit ? consectetur(ipsum, 0, dolor < 0 ? 0 : dolor) : [];
}

function adipiscing(...elit) {
  if (!elit.sit) {
    return [];
  }

  const sed = elit[0];
  return eiusmod.tempor(sed) ? sed : [sed];
}

function incididunt(ipsum, ut = 1) {
  ut = labore.et(amet(ut), 0);
  const sit = ipsum == null ? 0 : ipsum.sit;

  if (!sit || ut < 1) {
    return [];
  }

  let dolore = 0;
  let magna = 0;
  const aliqua = new eiusmod(labore.ut(sit / ut));

  while (dolore < sit) {
    aliqua[magna++] = consectetur(ipsum, dolore, (dolore += ut));
  }

  return aliqua;
}

`
——————————————————————————————
^ superfluous last backtick

This is the actual code fragment that is generated by above:

<pre style="color:#93a1a1;background-color:#002b36"><span style="color:#268bd2">function</span> lorem(ipsum, dolor <span style="color:#719e07">=</span> <span style="color:#2aa198">1</span>) {
  <span style="color:#268bd2">const</span> sit <span style="color:#719e07">=</span> ipsum <span style="color:#719e07">==</span> <span style="color:#cb4b16">null</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> ipsum.sit;
  dolor <span style="color:#719e07">=</span> sit <span style="color:#719e07">-</span> amet(dolor);
  <span style="color:#719e07">return</span> sit <span style="color:#719e07">?</span> consectetur(ipsum, <span style="color:#2aa198">0</span>, dolor <span style="color:#719e07">&lt;</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> dolor) <span style="color:#719e07">:</span> [];
}

<span style="color:#268bd2">function</span> adipiscing(...elit) {
  <span style="color:#719e07">if</span> (<span style="color:#719e07">!</span>elit.sit) {
    <span style="color:#719e07">return</span> [];
  }

  <span style="color:#268bd2">const</span> sed <span style="color:#719e07">=</span> elit[<span style="color:#2aa198">0</span>];
  <span style="color:#719e07">return</span> eiusmod.tempor(sed) <span style="color:#719e07">?</span> sed <span style="color:#719e07">:</span> [sed];
}

<span style="color:#268bd2">function</span> incididunt(ipsum, ut <span style="color:#719e07">=</span> <span style="color:#2aa198">1</span>) {
  ut <span style="color:#719e07">=</span> labore.et(amet(ut), <span style="color:#2aa198">0</span>);
  <span style="color:#268bd2">const</span> sit <span style="color:#719e07">=</span> ipsum <span style="color:#719e07">==</span> <span style="color:#cb4b16">null</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> ipsum.sit;

  <span style="color:#719e07">if</span> (<span style="color:#719e07">!</span>sit <span style="color:#719e07">||</span> ut <span style="color:#719e07">&lt;</span> <span style="color:#2aa198">1</span>) {
    <span style="color:#719e07">return</span> [];
  }

  <span style="color:#268bd2">let</span> dolore <span style="color:#719e07">=</span> <span style="color:#2aa198">0</span>;
  <span style="color:#268bd2">let</span> magna <span style="color:#719e07">=</span> <span style="color:#2aa198">0</span>;
  <span style="color:#268bd2">const</span> aliqua <span style="color:#719e07">=</span> <span style="color:#719e07">new</span> eiusmod(labore.ut(sit <span style="color:#719e07">/</span> ut));

  <span style="color:#719e07">while</span> (dolore <span style="color:#719e07">&lt;</span> sit) {
    aliqua[magna<span style="color:#719e07">++</span>] <span style="color:#719e07">=</span> consectetur(ipsum, dolore, (dolore <span style="color:#719e07">+=</span> ut));
  }

  <span style="color:#719e07">return</span> aliqua;
}
</pre><p>`</p>

Thanks for taking a look!

Typographic elements in heading are excluded from the automatically generated heading IDs

Hello,

For background and related discussion, please see the following post in Hugo forum.

https://discourse.gohugo.io/t/difference-in-auto-generated-heading-anchor-names-between-previous-versions-and-v0-60-x/22076

Please answer the following before submitting your issue:

What version of goldmark are you using? : 1.1.8 (included in Hugo 0.60.1)
What version of Go are you using? : 1.11.2 (but shouldn't matter as the test is done with pre-built Hugo)
What operating system and processor architecture are you using? : macOS 10.13.6, Intel Core i5
What did you do? : Upgrade Hugo from 0.54.0 to 0.60.1 to check the basic functionality
What did you expect to see? : Non-alphanumeric typographic elements (hyphen, period, underscore, etc.) in a heading are transformed into hyphens in the auto heading IDs (e.g. for the headings "Command-Gen-Instance" and "v1.0.0 (Apr 21, 2019)", the results are command-gen-instance and v1-0-0-apr-21-2019)
What did you see instead? : Non-alphanumeric typographic elements in a heading are excluded from the auto heading IDs (e.g. for the example above, the results are commandgeninstance and v100-april-21-2019)
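
To make the expected behaviour concrete, here is a minimal sketch of an ID generator that produces the results described above. The name headingID is hypothetical, and this is not the algorithm used by goldmark, Hugo, or Blackfriday; it only assumes that every run of non-alphanumeric characters collapses to a single hyphen and that leading and trailing hyphens are dropped.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// headingID is a hypothetical helper: it lowercases the heading, keeps
// letters and digits, and replaces each run of other characters with
// one hyphen. Leading and trailing separators are dropped.
func headingID(heading string) string {
	var b strings.Builder
	pendingHyphen := false
	for _, r := range strings.ToLower(heading) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			// Emit the delayed hyphen only between two kept characters.
			if pendingHyphen && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingHyphen = false
			b.WriteRune(r)
			continue
		}
		pendingHyphen = true
	}
	return b.String()
}

func main() {
	fmt.Println(headingID("Command-Gen-Instance"))
	fmt.Println(headingID("v1.0.0 (Apr 21, 2019)"))
}
```

With the two headings from this report, the sketch yields command-gen-instance and v1-0-0-apr-21-2019.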

Many thanks for your work with Goldmark.

Apostrophes in contractions are not converted to right single quote

For the following text:

I'm going to see my mother. She's very nice.

Currently, ' is not converted to ’ in contractions when the typography extension is enabled, whereas smartypants does convert it. I would expect the output to be:

I&rsquo;m going to see my mother. She&rsquo;s very nice.
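
For illustration, a rough sketch of the rule I have in mind: an apostrophe with a letter on both sides is treated as part of a contraction. The function name curlContractions is made up for this example, and it is not how the typography extension or smartypants is implemented; it ignores cases such as leading apostrophes ('tis) and quoted words.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// curlContractions is a hypothetical helper that replaces an apostrophe
// surrounded by letters with the &rsquo; entity and leaves every other
// character untouched.
func curlContractions(s string) string {
	runes := []rune(s)
	var b strings.Builder
	for i, r := range runes {
		if r == '\'' && i > 0 && i+1 < len(runes) &&
			unicode.IsLetter(runes[i-1]) && unicode.IsLetter(runes[i+1]) {
			b.WriteString("&rsquo;")
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(curlContractions("I'm going to see my mother. She's very nice."))
}
```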

Consider adding a context (data holder) to Render

This is a follow up to #37

So, setting state on the nodes in the AST and then use that while rendering works, but ...

It makes for some fairly clumsy and verbose code
It breaks the separation of concerns (adding rendering code to the parser)

What I'm now doing instead is something like this:

	w := renderContext{
		BufWriter: bufio.NewWriter(buf),
		renderContextData: renderContextDataHolder{
			rctx: ctx,
			dctx: c.ctx,
		},
	}

	if err := c.md.Renderer().Render(w, ctx.Src, doc); err != nil {
		return nil, err
	}

This works great, and I don't mind doing it like this (this is entirely internal), but the downside is that it may stop working in the future if you decide to wrap the writer or something.

An infinite loop in ASTTransformer

func (*test) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
  walk(node)
}

func walk(node ast.Node) {
  for n := node.FirstChild(); n != nil; n = node.NextSibling() {
    walk(n)
  }
}

Footnote parsing error

test![^1]

[^1]: footnote

<p>test![^1]</p>

This happens if an exclamation mark is placed before the footnote link.

Comma breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {
   var s1 = []byte("https://github.com#sun,mon")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com#sun">https://github.com#sun</a>,mon</p>

With the github.com parser, I get this result:

<p><a href="https://github.com#sun,mon">https://github.com#sun,mon</a></p>

Example:

https://github.com#sun,mon
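
The GFM spec says that trailing punctuation (?, !, ., ,, :, *, _ and ~) is not part of an extended autolink, but that these characters may appear in the interior of the link. A simplified sketch of that rule follows; autolinkExtent is a hypothetical name, and the sketch deliberately ignores the spec's parenthesis balancing and entity handling.

```go
package main

import (
	"fmt"
	"strings"
)

// autolinkExtent is a hypothetical helper. Given text that starts with a
// URL, it returns the part that would be linked: everything up to the
// first whitespace or '<', minus any trailing punctuation.
func autolinkExtent(s string) string {
	end := strings.IndexAny(s, " \t\r\n<")
	if end < 0 {
		end = len(s)
	}
	// Only punctuation at the very end is trimmed; interior commas stay.
	return strings.TrimRight(s[:end], "?!.,:*_~")
}

func main() {
	fmt.Println(autolinkExtent("https://github.com#sun,mon"))
	fmt.Println(autolinkExtent("https://github.com#sun, mon"))
}
```

Under this rule the comma in https://github.com#sun,mon stays inside the link, because it is followed by more URL characters.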

Fenced code block with carriage returns causes a panic error

Hey!

I am trying to "markdownify" input coming from an HTML textarea, and it contains carriage returns.

Using a fenced code block with carriage returns causes the whole program to panic with a slice bounds out of range error.

Here is an example:

package main

import (
	"bytes"
	"fmt"
	"html/template"

	"github.com/yuin/goldmark"
)

func main() {
	var buf bytes.Buffer
	if err := goldmark.Convert([]byte("lol\r\n\r\n```\r\nok\r\n```\r\n\r\nyes"), &buf); err != nil {
		panic(err)
	}
	fmt.Printf("%v", template.HTML(buf.String()))
}

panic: runtime error: slice bounds out of range

goroutine 1 [running]:
github.com/yuin/goldmark/parser.(*fencedCodeBlockParser).Open(0x8336e0, 0x6b0340, 0xc000102600, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0, 0x0, 0x0, 0x10)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/fcode_block.go:51 +0x419
github.com/yuin/goldmark/parser.(*parser).openBlocks(0xc000172000, 0x6b0340, 0xc000102600, 0x1, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0, 0x3)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:849 +0x27e
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc000172000, 0x6b0340, 0xc000102600, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:923 +0x1ce
github.com/yuin/goldmark/parser.(*parser).Parse(0xc000172000, 0x6afa80, 0xc00009a7e0, 0x0, 0x0, 0x0, 0x20, 0x62e2c0)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:771 +0x157
github.com/yuin/goldmark.(*markdown).Convert(0xc00017a000, 0xc0000b6a00, 0x1a, 0x1a, 0x6ab8a0, 0xc00007ad20, 0x0, 0x0, 0x0, 0x0, ...)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/markdown.go:116 +0x94
github.com/yuin/goldmark.Convert(...)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/markdown.go:31
main.main()
        /home/thomas/Proj/blobstash/cmd/lol/p.go:13 +0xbc
exit status 2
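
As a stopgap on my side, normalizing line endings before handing the input to the converter should sidestep the problem. This is only a sketch under that assumption (I have not verified it against every input), and normalizeNewlines is a name made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeNewlines is a hypothetical helper that converts CRLF and lone
// CR line endings to LF so the parser only ever sees "\n".
func normalizeNewlines(src []byte) []byte {
	src = bytes.ReplaceAll(src, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(src, []byte("\r"), []byte("\n"))
}

func main() {
	in := []byte("lol\r\n\r\ncode\r\nyes")
	fmt.Printf("%q\n", normalizeNewlines(in))
}
```

The normalized bytes would then be passed to goldmark.Convert in place of the raw textarea input.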

Thanks!

Pandoc Markdown Compatibility?

Goldmark's CommonMark compatibility is amazing, and with attribute support, the mathjax extension, and the metadata extension, it covers what I feel are the most popular parts of Pandoc Markdown. It seems very possible that Goldmark could eventually replace external pandoc dependencies in many Go applications. I would very much like to contribute toward that goal, and I'm aware of others who would too.

To that end I am seeking some design direction and consensus about how to move forward.

An extension for each feature seems most reasonable, but I opened this issue to check whether a single full AST Transformer might be a better approach. Personally I prefer the modularity of an extension for each --- particularly Pandoc's unique Simplified Tables --- and find the composition design that Pandoc has used for its internals valuable.

Which design direction is most recommended for such work? Several extensions or a single Transformer? I'm almost sure the answer is extensions but am asking anyway to avoid something I may have missed.

Thank you. (If there is a better place to have this discussion please let me know.)

question: Passing state to a rendering extension

I'm in the process of creating some link/image extensions that would allow for link resolution/image resize etc.

For that to work, I need to pass on some document state to the custom link renderer. But I don't see how.

The Parse method can take a context, but I don't see a similar way to pass a struct via Render. I could create a new goldmark.Markdown for each document, but that sounds wasteful.

Add some kind of "non-rendering render hook"

In working on adding this to Hugo, I wanted to implement ToC in a general way that we could possibly also use for other things; e.g. a "content map" with byte slice pointers (start/stop) into the rendered content.

I experimented by creating an extension:

https://github.com/bep/hugo/blob/goldmark2/markup/goldmark/contentmap.go#L35

But that doesn't work, as I notice that you pick up the first renderer for a given node kind.

Note that for the ToC thing (which is what Hugo has today), I can traverse the AST and build the ToC from that, but it would be really useful if I could somehow register the rendered start/stop position for the different blocks, so people could do things like:

Split content over multiple pages
Insert ads/bylines etc.
...

Again, thanks for this library, it's really easy to use.

No EOL at the end of file breaks processing

Test case: https://github.com/mironovalexey/gm-test (eof package).

Rather a question

Thank you for the great work on this library! I've been looking for a clean implementation like this that works out of the box without any patches.

Now, for a project I'm working on, I need to hack the default renderer a little. Could you please point me to where I would plug in custom rendering of YouTube auto-links? Basically, I have them as *ast.AutoLink nodes, and I'm trying to rewrite their rendering so that they appear in <div>s with the <img> of the YouTube preview picture and an <a> leading to the video. That's the idea.

So far I've been able to declare a custom type:

// CustomGoldmarkRenderer renders specific markdown documents containing video links
type CustomGoldmarkRenderer struct {
	defaultRenderer renderer.Renderer
	file            *[]byte
}

which I then make implement the Renderer interface:

func (c CustomGoldmarkRenderer) Render(w io.Writer, source []byte, n ast.Node) error {
	ast.Walk(n, func(n ast.Node, entering bool) (status ast.WalkStatus, err error) {
		switch t := n.(type) {
		case *ast.AutoLink:
			url := string(t.URL(*c.file))
			matches := youTubeLinkRegex.FindAllStringSubmatch(url, -1)
			if len(matches) == 0 {
				// Or try a short link
				matches = youTubeShortLinkRegex.FindAllStringSubmatch(url, -1)
			}
			if len(matches) > 0 {
				videoID := matches[0][1] // Group 1 stands for the first (...) block
				if entering {
					fmt.Fprintf(w, `
					<div class="py-2 col-12 col-xl-3 col-lg-3 col-md-4 mb-2">
						<a href="%s" target="_blank" class="d-block h-180">
							<img class="img-fluid img-thumbnail rounded" src="%s" alt="%s"/>
						</a>
						%s`,
						url,
						"https://img.youtube.com/vi/"+videoID+"/mqdefault.jpg",
						"title",
						"titleHTML")
					return ast.WalkSkipChildren, nil
				} else {
					fmt.Fprintf(w, `</div>`)
				}
			}
		}
		return ast.WalkContinue, nil
	})
	return c.defaultRenderer.Render(w, source, n)
}

so that I'm able to plug my custom renderer into md := goldmark.New(...) as follows:

	md.SetRenderer(CustomGoldmarkRenderer{
		defaultRenderer: md.Renderer(),
		file:            &file,  // Passing original source so that it becomes available in parsing function
	})

	var buf bytes.Buffer
	if err := md.Convert(file, &buf); err != nil {
		panic(err)
	}

Now, of course, what I get is just the additional <div>s rendered by my walker, followed (unsurprisingly) by the ordinary rendering, which is appended to the io.Writer when return c.defaultRenderer.Render(w, source, n) runs.

Being an amateur Go coder, I can't figure out how to render the rest of the nodes the default way while doing my own rendering node by node in the custom ast.Walk(...) call, since return c.defaultRenderer.Render(w, source, n) seems to be called just once, for the Document node, and doesn't help with individual nodes.

So, would you be so kind as to hint where I'm going wrong and which direction I should take instead?
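
For completeness, the snippet above uses youTubeLinkRegex and youTubeShortLinkRegex without showing them. Here is a minimal sketch of that matching step. The patterns are assumptions written for this example (they expect an 11-character video ID), and youTubeVideoID and thumbnailURL are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumed patterns for full and short YouTube links; group 1 is the ID.
var (
	youTubeLinkRegex      = regexp.MustCompile(`^https?://(?:www\.)?youtube\.com/watch\?(?:.*&)?v=([\w-]{11})`)
	youTubeShortLinkRegex = regexp.MustCompile(`^https?://youtu\.be/([\w-]{11})`)
)

// youTubeVideoID tries the full pattern first, then the short one, and
// reports whether either matched.
func youTubeVideoID(url string) (string, bool) {
	for _, re := range []*regexp.Regexp{youTubeLinkRegex, youTubeShortLinkRegex} {
		if m := re.FindStringSubmatch(url); m != nil {
			return m[1], true
		}
	}
	return "", false
}

// thumbnailURL builds the preview image address used in the renderer above.
func thumbnailURL(videoID string) string {
	return "https://img.youtube.com/vi/" + videoID + "/mqdefault.jpg"
}

func main() {
	if id, ok := youTubeVideoID("https://youtu.be/dQw4w9WgXcQ"); ok {
		fmt.Println(thumbnailURL(id))
	}
}
```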

goldmark can't emphasize specific Chinese characters

After using goldmark to process the markdown text **「刻舟求剑」**, the result is still **「刻舟求剑」**, but the expected output is the emphasized text 「刻舟求剑」.

What version of goldmark are you using? : v1.1.11
What version of Go are you using? : 1.12
What operating system and processor architecture are you using? : Hugo v0.60.1 on macOS
What did you do? : Create a markdown file with **
What did you expect to see? : The words should be emphasized
What did you see instead? : The words were not emphasized
(Feature request only): Why you can not implement it as an extension?: not applicable

HTML comments can break the processing

https://github.com/mironovalexey/gm-test/tree/master/hcomments

https://github.com/mironovalexey/gm-test/blob/master/hcomments/test.md

Any line between --- and --- breaks the processing.

Autolinks

Other parsers allow for bare autolinks. For example:

http://example.com

returns:

<a href="http://example.com">http://example.com</a>

https://github.github.com/gfm#autolinks-extension-

New lines within span-level elements

When a span-level element contains new lines, its content is not treated as Markdown.

Test case: https://github.com/mironovalexey/gm-test (html package).

https://github.com/mironovalexey/gm-test/blob/master/html/test.md
