Comments (28)
I will have to think more about it. Right now, it is clear what we would get from it and its benefits, but we still need to decide if we want to go down this route and what coupling it introduces. :) Thank you!
from ex_doc.
Hi @pnezis, thanks for the proposal.
The example you proposed doesn't work semantically speaking (you can't invoke a function before you define it) but I think the embed idea is valid. I am wondering though if we need the embed::
prefix or if we just allow you to hijack anything. If we do the latter, we could even implement the current syntax highlighter as "embeds".
from ex_doc.
Hi @josevalim , thanks for the quick reply
The example you proposed doesn't work semantically speaking (you can't invoke a function before you define it
Correct, semantically the interpolation shouldn't work. It was just a demonstration of what we cannot do and how we can overcome this with the proposed "embed" functionality. Such a functionality could open up the road for auto-generated doc segments, for example you could have special embeds for ecto schemas:
defmodule User do
@moduledoc """
A `User` object
```ecto-schema
function_that_auto_generates_ecto_schema_docs(User)
```
use Ecto.Schema
...
"""
Another example is the mermaid graph in the POC project, where this function doc:
![image](https://private-user-images.githubusercontent.com/1561963/244948614-4a632779-123d-4b79-96ee-11a9ec59898a.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDk5ODQ0NTQsIm5iZiI6MTcwOTk4NDE1NCwicGF0aCI6Ii8xNTYxOTYzLzI0NDk0ODYxNC00YTYzMjc3OS0xMjNkLTRiNzktOTZlZS0xMWE5ZWM1OTg5OGEucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDMwOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDAzMDlUMTEzNTU0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9Y2M1ZWY4M2UxMDBkMTIwOGMwODkxYmFlYmFmYzVhZDMxNDhjMGI3NjEzYzhmZjM3ODgyNTFiOTllOTE3YTM0ZiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.HZ9W0dxxfWDZ5yUN8WP1hpdDZCZTYnM_ekPpx9Fis0E)
is transformed to this rich version:
![image](https://private-user-images.githubusercontent.com/1561963/244948660-d513539f-0c82-4509-99af-baa70dbf9fa0.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDk5ODQ0NTQsIm5iZiI6MTcwOTk4NDE1NCwicGF0aCI6Ii8xNTYxOTYzLzI0NDk0ODY2MC1kNTEzNTM5Zi0wYzgyLTQ1MDktOTlhZi1iYWE3MGRiZjlmYTAucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDMwOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDAzMDlUMTEzNTU0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MzVjODgzNDk4NmI2YjZhMjMwMTkxNWZkNDU0MzBlYWFjMzg4NmY1NjI2ODNjYjVhYTRjN2I3NTNkY2FlMTFiZCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.zGV9e4ZCwHa3jisy14uFrC9af1u1JHbV5aAc2Y8LlzY)
I am wondering though if we need the embed:: prefix or if we just allow you to hijack anything. If we do the latter, we could even implement the current syntax highlighter as "embeds".
You are right, we could hijack anything the embed::
prefix was just a POC suggestion. But I think we should limit it to fenced code blocks.
Also embeds
could be exported from dependencies, which you could use directly in your project for enhancing the documentation, for example ecto
could expose an embed
that auto generates schema docs, and vega_lite
could export an embed
that pretty-prints a VegaLite plot.
from ex_doc.
What do you think about this:
-
We introduce the concept of
FenceProcessor
. -
For each fenced code block, we call the
FenceProcessor.process(format, fence, contents)
. You return{:ok, binary} | :skip
. -
We allow folks to register custom fence processors to be called in order. The current syntax highlighter could be a fence processor (which we will use to dog food this API).
from ex_doc.
At which level do you see this to be applied? The syntax highlighter is applied once the original markdown
has been transformed from AST to string and transforms the string to html
.
I would expect this to be also applicable at the AST level. This way the output of the FenceProcessor
could be plain markdown
, e.g.
def process(_format, fence, contents) do
"""
Returns a `markdown` formatted version of:
```elixir
#{contents}
```
"""
end
We could add this in the ExDoc.Markdown.to_ast
, something like:
defmodule ExDoc.Markdown
def to_ast(text, opts \\ []) when is_binary(text) do
{processor, options} = get_markdown_processor()
text
|> processor.to_ast(options |> Keyword.merge(opts))
# add this
|> apply_fence_processors(opts)
end
def apply_fence_processors(ast, opts)
# call FenceProcessor.process on each valid fence block contents
# convert the returned binary to ast and replace the original
# ast's code block with the new one
end
Applying this to the string level is also valid. What if:
- We make
FenceProcessor
a behaviour? - Add two optional callbacks:
ast_process
string_process
We allow folks to register custom fence processors to be called in order.
Shouldn't the fence processors be somehow mapped to the fence language, something like:
fence_processors: [
elixir: [ExDoc.FenceProcessors.ElixirSyntaxHighlighter]
elixir_fancy: [
ExDoc.FenceProcessors.ElixirSyntaxHighlighter,
MyCustomElixirFenceProcessor
],
vega_lite: [
MyVegaLiteFenceProcessor
]
]
What do you think?
from ex_doc.
I don't want to pre-process the markdown because that comes with a lot of complexity.
Shouldn't the fence processors be somehow mapped to the fence language, something like:
Not necessary. The preprocessor will receive all fenced languages and, if it doesn't care about it, it will return :skip
.
from ex_doc.
I don't want to pre-process the markdown because that comes with a lot of complexity.
Not the markdown but the AST, this seems to me straight forward (except if I am missing something). Otherwise how would the FenceProcessor
support complex output? For example look at the implementation of the mermaid
renderer in the demo project.
It returns a mardown string including two fenced blocks:
"""
```elixir
#{code}
```
```mermaid
#{mermaid}
```
"""
How would we handle this if the fence processing takes place at the string level? These fenced blocks should also be "fence processed".
Not necessary. The preprocessor will receive all fenced languages and, if it doesn't care about it, it will return :skip
Correct but you will lose composability this way. Imagine you have an Elixir
fence processor and a generic processor that puts a code block in a fancy div. Now you want to use this fancy div only on some elixir code blocks.
By specifying an elixir_fancy
language
like this:
elixir: [ElixirFenceProcessor]
elixir_fancy: [
ElixirFenceProcessor,
FancyDivFenceProcessor
],
you will be able to have both block types in your codebase
```elixir
# this is the default elixir block
```
```elixir_fancy
# this will use the fancy processor on top of the default elixir processor
```
from ex_doc.
How would we handle this if the fence processing takes place at the string level? These fenced blocks should also be "fence processed".
Then you call the fences API recursively.
Correct but you will lose composability this way.
I am pretty sure you can call the fences API recursively here too. :) We will expose something like ExDoc.FencesProcessor.process(format, language, content)
.
from ex_doc.
Then you call the fences API recursively.
Correct but you will have to implement the markdown to html conversion again internally in your Processor. Right?
from ex_doc.
Yes, but I can't a good way to do pre-processing in Markdown, in a way we can inject HTML (necessary for syntax highlighting) and without coupling to Earmark. And your markdown example could easily return the HTML bits directly, no?
from ex_doc.
For the syntax highlighting use case, I totally agree this is the best way to proceed. But FenceProcessors
could be something much more versatile.
And your markdown example could easily return the HTML bits directly, no?
It could return the HTML bits directly yes but I see some drawbacks here:
- any change on the way
ExDoc
renders markdown would make the processors inconsistent - it wouldn't handle existing functionality e.g. autolinks
- it would make the initial purpose of this proposal, actual code evaluation, very difficult since the transformed string will be escaped HTML.
Let's look at a concrete example. Let's imagine we want to implement a fence processor that:
- Evaluates the inline content
- Returns two fenced elixir blocks the first with the initial content and the second with the evaluated result
```inspect
Enum.map([1, 2, 3], fn x -> 2*x end)
```
Working with the AST
- The implementation is straight forward since the
content
is verbatim the code in the original inspect block
def process(_format, "inspect", content) do
{result, _} = Code.eval_string(code, [], __ENV__)
"""
```elixir
#{code}
```
```elixir
#{inspect(result)}
```
end
- We only have to traverse the markdown ast, apply the processor on the matching
pre
blocks and replace thepre inspect
block:
[
{:pre, [],
[{:code, [class: "inspect"], ["Enum.map([1, 2, 3], fn x -> 2*x end)"], %{}}],
%{}}
]
with the processed one:
[
{:pre, [],
[{:code, [class: "elixir"], ["Enum.map([1, 2, 3], fn x -> 2*x end)"], %{}}],
%{}},
{:pre, [], [{:code, [class: "elixir"], ["[2, 4, 6]"], %{}}], %{}}
]
- The output of the
process
function can be any markdown text, no need to handle the html internally, everything will be handled byex_doc
, autolinks will also work
Working on the string level
If I have understood correctly we will call the processors on the DocAST.to_string
output so in this example the string will be:
<pre><code class=\"inspect\">Enum.map([1, 2, 3], fn x -> 2*x end)</code></pre>
which means that the content
part of the process
will be Enum.map([1, 2, 3], fn x -> 2*x end)
In this case we need to
- unescape the string
- apply the processor's logic
- manually generate the html bits
What if we support both, e.g. add make the process
signature:
FenceProcessor.process(format, fence, mode, contents)
where mode
is one of :ast
, :string
?
Sorry for the back and forth, I am probably missing something.
from ex_doc.
We only have to traverse the markdown ast
I found the point of confusion. That's not the markdown ast, that's html. :) My proposal is that we will indeed match on the HTML AST but we will require an HTML string to be returned.
The output of the process function can be any markdown text, no need to handle the html internally, everything will be handled by ex_doc, autolinks will also work
Well, except it doesn't just work in corner cases. For example, what happens if the code you are inspect contains ```
in it? Now you need to escape it. I think Markdown is tricky as an auto-generated format due to all of its ambiguities.
Working with HTML is not much more complex, your inspect snippet would be implemented as:
def process(_format, "inspect", content) do
{result, _} = Code.eval_string(code, [], __ENV__)
with {:ok, content} <- ExDoc.FenceProcessor.format(format, "elixir", content),
{:ok, result} <- ExDoc.FenceProcessor.format(format, "elixir", inspect(result)),
do: {:ok, content <> result}
end
from ex_doc.
found the point of confusion. That's not the markdown ast, that's html. :)
That makes sense now, I was sure I was confused :)
Working with HTML is not much more complex
How would you handle a processor that wants to return something like:
def process(_format, "inspect", _content) do
"""
This is a **rich** response, containing
* lists
* with sublists
> Some quoted text
### A header
This is an `Enum` (with an autolink) example:
```elixir
Enum.map([1, 2, 3], fn x -> 2*x end
```
"""
end
For example, what happens if the code you are inspect contains ``` in it? Now you need to escape it.
Valid point
I think Markdown is tricky as an auto-generated format due to all of its ambiguities.
I agree but the docs are already in markdown. Also a lot of libraries have helper functions that generate markdown docs, take for example NimbleOptions.docs
. You could use it directly in a fenced block without having to convert it to html.
def process(_format, "nimble_options", content) do
{schema, _} = Code.eval_string(content, [], __ENV__)
NimbleOptions.docs(schema)
end
from ex_doc.
How would you handle a processor that wants to return something like:
Do we have valid use cases for it? Perhaps the NimbleOptions is one but we can as easily interpolate it at compile-time nowadays?
from ex_doc.
It's me again @josevalim, now that I am thinking it again, if we expose an ExDoc.markdown_to_html
we could handle both use cases:
def process(_format, "inspect", _content) do
"""
This is a **rich** response, containing
* lists
* with sublists
> Some quoted text
### A header
This is an `Enum` (with an autolink) example:
```elixir
Enum.map([1, 2, 3], fn x -> 2*x end
```
"""
|> ExDoc.markdown_to_html()
end
from ex_doc.
Perhaps. The question is: can we expose something like that? I am afraid auto-linking requires further context that comes just later in the stack but I may be wrong.
from ex_doc.
Do we have valid use cases for it? Perhaps the NimbleOptions is one but we can as easily interpolate it at compile-time nowadays?
We can interpolate it at compile time if you define the schema as a module attribute. I often define schemas in a callback with complex logic which makes it impossible to interpolate. So I call the docs
in another module and not the one where the schema is defined in order for interpolation to work.
Another use case is auto-generated documentation for ecto schemas. Currently I am having a function similar to NimbleOptions.docs
:
defmodule MySchema do
@moduledoc """
#{ SchemaDocs.docs(__ENV__.file, "my_schema")}
"""
end
which loads the file, traverses the AST, and auto-generates the schema docs.
from ex_doc.
I am afraid auto-linking requires further context that comes just later in the stack but I may be wrong.
We can do it without having the context of the module. Which means local references won't work, which will likely break usage within NimbleOptions
and perhaps Ecto schemas, so I don't think that's a viable path.
It seems in order to make this work we would have to couple several different parts of ExDoc and I am not sure the costs justify the benefits.
from ex_doc.
We can do it without having the context of the module. Which means local references won't work, which will likely break usage within
NimbleOptions
and perhaps Ecto schemas, so I don't think that's a viable path.
If you do the injection directly in the Markdown.to_ast
this will work. Proof of concept:
@doc """
A `NimbleOptions` schema
```nimble_options
MyModule.schema()
```
"""
def schema do
[
name: [
required: true,
type: :atom,
doc: "See also `echo_name/1`"
]
]
end
![image](https://private-user-images.githubusercontent.com/1561963/245112408-a138529c-c476-4ede-962d-3ba9d530b6c9.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDk5ODQ0NTQsIm5iZiI6MTcwOTk4NDE1NCwicGF0aCI6Ii8xNTYxOTYzLzI0NTExMjQwOC1hMTM4NTI5Yy1jNDc2LTRlZGUtOTYyZC0zYmE5ZDUzMGI2YzkucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDMwOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDAzMDlUMTEzNTU0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YWE3MjhmNTZhZmY2NjM5ODM2ZmI2ZjI4MDM4YTEwMmJhYmZiOTEyZDAwNTBiZWUyNWVmYzhlMzk1NWY0YWYyMCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.z5aEK8GOpk-Iv8H3jtPGO5oSxZM--BhAkbHsOkuJDHI)
Would you like to see a PR first and discuss the details there?
from ex_doc.
Yes, let's see a PR, because I can't see a straight-forward way of doing that, but maybe there is. :)
from ex_doc.
Thanks for the POC! I was missing that the autolinking actually happens in the HTML AST, so it is all good. I think we can do this contract:
{:html, content} | {:markdown, content} | {:next, content}
The first means we returned HTML, we return markdown, or we return updated content to continue to go through the pipeline (solving the recursion case).
WDYT?
from ex_doc.
I am not that familiar with the ex_doc
codebase so correct me if I am wrong. I think the function under discussion is this one:
defp autolink_and_render(doc, language, autolink_opts, opts) do
doc # This is an HTML AST
|> language.autolink_doc(autolink_opts)
|> ExDoc.DocAST.to_string()
|> ExDoc.DocAST.highlight(language, opts)
end
- If we want to support markdown expansion we need to update the HTML ast before the
autolink_doc
call - The syntax highlighting will take place after it.
So we need to expand the fences twice and we need to indicate at which point of the pipeline we are:
defp autolink_and_render(doc, language, autolink_opts, opts) do
doc
|> apply_fence_processing(:markdown) # apply markdown expansion and update the AST like in the example
|> language.autolink_doc(autolink_opts)
|> ExDoc.DocAST.to_string() # this will also apply fence processing internally and the current highlight
# function will be an `elixir` FenceProcessor
end
So should the signature of the process
be:
@spec process(format :: binary(), fence :: binary(), content :: {:html | :markdown, binary()} )
from ex_doc.
You are right. So basically my idea of integrating Makeup into this wouldn't actually work. We need to keep them as distinct operations.
from ex_doc.
Would you be interested on the functionality proposed by the PR #1719? IMHO it would help with auto-generated doc segments in many places.
If yes I can polish the existing PR, and create a new one for the syntax highlighting part.
from ex_doc.
Hi @josevalim, thinking it again I don't think this should be added as default behaviour to ex_doc
. Since you can define markdown_processor
I could just create an earmark
wrapper that post-processes the ast
.
from ex_doc.
@pnezis that's actually a great idea. Please give it a try and let me know if you run into any issues :)
from ex_doc.
@josevalim it works like a charm :)
Wrapper: https://github.com/pnezis/fancy_fences
Sample config on the POC project :
markdown_processor:
{FancyFences,
[
fences: %{
"inspect" => {FenceProcessors, :io_inspect, []},
"nimble_options" => {FenceProcessors, :nimble_options, []},
...
}
]},
Thanks for all the help, feel free to close the issue
from ex_doc.
❤️ !!!
from ex_doc.
Related Issues (20)
- Auto link with anchor is not working HOT 6
- Mobile regression: menu icon no longer sticky HOT 6
- URL in browser not updating when clicking link icon on iOS HOT 3
- @since tag not rendered correctly for Erlang source Types? HOT 1
- Add a title attribute to sidebar links HOT 13
- Search is not working for some terms HOT 1
- Sidebar overflow is not showing a scrollbar on my browsers HOT 1
- SVG logo is not shown HOT 3
- Getting an invalid epub format
- EPUB: Links to anchors like #&&/2 cause a fatal error while parsing the file
- Autolink private dependencies HOT 1
- UX issues related to search bar location HOT 6
- Linebreaks in the middle of `++` HOT 8
- Head of document missing when printing HOT 3
- Equivalent for edocs's {sort_functions, false} HOT 5
- Admonitions not styled properly within cheat sheets HOT 7
- Text of code block within a info admonition block is barely legible in darkmode HOT 5
- Allow adding *.exs files to the docs HOT 2
- Introduce meta tags to customize search HOT 1
- Search thinks there are ids for <h4> HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ex_doc.