Comments (4)

Gewerd-Strauss commented on September 20, 2024

I tracked the issue down to the fact that latex blocks may contain inline code. Currently, inline code gets replaced before latex blocks:

    def StripCodeSections(self):
        """(Temporarily) Remove codeblocks/-lines so that they are not altered in all the conversions. Placeholders are inserted."""
        self.codeblocks = re.findall(r"^```([\s\S]*?)```[\s]*?$", self.page, re.MULTILINE)
        for i, match in enumerate(self.codeblocks):
            self.page = self.page.replace("```" + match + "```", f"%%%codeblock-placeholder-{i}%%%")

        self.codelines = re.findall("`(.*?)`", self.page)
        for i, match in enumerate(self.codelines):
            self.page = self.page.replace("`" + match + "`", f"%%%codeline-placeholder-{i}%%%")

        self.latexblocks = re.findall(r"^\$\$([\s\S]*?)\$\$[\s]*?$", self.page, re.MULTILINE)
        for i, match in enumerate(self.latexblocks):
            self.page = self.page.replace("$$" + match + "$$", f"%%%latexblock-placeholder-{i}%%%")

but the reinsertion in RestoreCodeSections(self) does not run in the reverse order:

    def RestoreCodeSections(self):
        """Undo the action of StripCodeSections."""
        for i, value in enumerate(self.codeblocks):
            self.page = self.page.replace(f"%%%codeblock-placeholder-{i}%%%", f"```{value}```\n")
        for i, value in enumerate(self.codelines):
            self.page = self.page.replace(f"%%%codeline-placeholder-{i}%%%", f"`{value}`")
        for i, value in enumerate(self.latexblocks):
            self.page = self.page.replace(f"%%%latexblock-placeholder-{i}%%%", f"$${value}$$")

The replacement parts are correctly stored in self.codelines, and everything is parsed correctly-ish. So, to resolve this issue you just rearrange the order of operations when restoring code sections:

    def RestoreCodeSections(self):
        """Undo the action of StripCodeSections."""
        for i, value in enumerate(self.codeblocks):
            self.page = self.page.replace(f"%%%codeblock-placeholder-{i}%%%", f"```{value}```\n")
        for i, value in enumerate(self.latexblocks):
            self.page = self.page.replace(f"%%%latexblock-placeholder-{i}%%%", f"$${value}$$")
        for i, value in enumerate(self.codelines):
            self.page = self.page.replace(f"%%%codeline-placeholder-{i}%%%", f"`{value}`")
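
To illustrate why the restore order matters, here is a minimal sketch that reduces the two quoted methods to the two section types involved (inline code and latex blocks). The stand-alone functions `strip_sections`, `restore_codelines` and `restore_latexblocks` and the sample page are made up for this demo; only the regexes and placeholder strings are taken from the code above.

```python
import re

# Hypothetical stand-alone reduction of the quoted methods; names are made up.
def strip_sections(page):
    """Strip inline code first, then latex blocks (same relative order as above)."""
    codelines = re.findall("`(.*?)`", page)
    for i, match in enumerate(codelines):
        page = page.replace("`" + match + "`", f"%%%codeline-placeholder-{i}%%%")
    latexblocks = re.findall(r"^\$\$([\s\S]*?)\$\$[\s]*?$", page, re.MULTILINE)
    for i, match in enumerate(latexblocks):
        page = page.replace("$$" + match + "$$", f"%%%latexblock-placeholder-{i}%%%")
    return page, codelines, latexblocks

def restore_codelines(page, codelines):
    for i, value in enumerate(codelines):
        page = page.replace(f"%%%codeline-placeholder-{i}%%%", f"`{value}`")
    return page

def restore_latexblocks(page, latexblocks):
    for i, value in enumerate(latexblocks):
        page = page.replace(f"%%%latexblock-placeholder-{i}%%%", f"$${value}$$")
    return page

# Sample page: a latex block that contains inline code.
page = "Text with `inline` code.\n\n$$\nx = `r 1 + 1`\n$$\n"
stripped, codelines, latexblocks = strip_sections(page)

# Current order: inline code is restored before the latex block that holds
# its placeholder, so that placeholder is never found.
broken = restore_latexblocks(restore_codelines(stripped, codelines), latexblocks)

# Rearranged order: latex blocks first, then inline code.
fixed = restore_codelines(restore_latexblocks(stripped, latexblocks), codelines)
```

In this sketch `broken` still contains `%%%codeline-placeholder-1%%%` inside the restored latex block, while `fixed` is identical to the input page.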

However, in principle you could nest inline code into a codeblock the same way you do with a latex block. In that case, the above solution would fail for the same reason the master branch fails in this issue. So, you could reorder both RestoreCodeSections() and StripCodeSections() to

    def StripCodeSections(self):
        """(Temporarily) Remove codeblocks/-lines so that they are not altered in all the conversions. Placeholders are inserted."""
        self.codelines = re.findall("`(.*?)`", self.page)
        for i, match in enumerate(self.codelines):
            self.page = self.page.replace("`" + match + "`", f"%%%codeline-placeholder-{i}%%%")

        self.codeblocks = re.findall(r"^```([\s\S]*?)```[\s]*?$", self.page, re.MULTILINE)
        for i, match in enumerate(self.codeblocks):
            self.page = self.page.replace("```" + match + "```", f"%%%codeblock-placeholder-{i}%%%")

        self.latexblocks = re.findall(r"^\$\$([\s\S]*?)\$\$[\s]*?$", self.page, re.MULTILINE)
        for i, match in enumerate(self.latexblocks):
            self.page = self.page.replace("$$" + match + "$$", f"%%%latexblock-placeholder-{i}%%%")

    def RestoreCodeSections(self):
        """Undo the action of StripCodeSections."""
        for i, value in enumerate(self.latexblocks):
            self.page = self.page.replace(f"%%%latexblock-placeholder-{i}%%%", f"$${value}$$")
        for i, value in enumerate(self.codeblocks):
            self.page = self.page.replace(f"%%%codeblock-placeholder-{i}%%%", f"```{value}```\n")
        for i, value in enumerate(self.codelines):
            self.page = self.page.replace(f"%%%codeline-placeholder-{i}%%%", f"`{value}`")

The issue here, again, is that in theory, if you flip it upside down, you could have code blocks containing latex blocks. Not sure if there's an actual application where that would be viable, especially as this would be extremely dependent on the markdown postprocessor used.

I'd argue that especially codelines in latex blocks are a valid option to support, given that e.g. RMarkdown allows for it and is well-used by a shitton of people. Dynamic document generation is awesome, yes I'll preach about it till my death :P.

With this design, the edge case of the first-replaced type containing the second- or third-replaced type will always lead to issues. I don't know a way to circumvent that off the top of my head, so I would suggest favouring the most likely and most useful version.
And in my opinion, you're much more likely to have a latex block containing code in a markdown doc than a code block containing latex syntax. But I might be wrong there.

I guess this pit is deeper than I initially thought, and opens up to the wonderful world of edge-cases. I suppose one could crawl each document top to bottom and capture/restore it properly that way, just not sure if that is worth the hassle.
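
For what it's worth, a top-to-bottom crawl could be roughly sketched as a single regex pass in which whichever construct starts first swallows everything nested inside it. This is only an illustration under assumptions, not what obsidian-html does: the combined pattern, the single `%%%protected-placeholder-N%%%` placeholder and the function names are all hypothetical, and fences that do not start at the beginning of a line are not handled.

```python
import re

# One combined pattern, tried left to right: the construct that starts first
# wins, and anything nested inside it is stored as part of that match.
PROTECTED = re.compile(
    r"^```[\s\S]*?```[ \t]*$"       # fenced code block
    r"|^\$\$[\s\S]*?\$\$[ \t]*$"    # latex block
    r"|`[^`\n]*`",                  # inline code
    re.MULTILINE,
)
PLACEHOLDER = re.compile(r"%%%protected-placeholder-(\d+)%%%")

def strip_protected(page):
    """Replace every protected span with a numbered placeholder in one pass."""
    stored = []

    def stash(match):
        stored.append(match.group(0))
        return f"%%%protected-placeholder-{len(stored) - 1}%%%"

    return PROTECTED.sub(stash, page), stored

def restore_protected(page, stored):
    """Put the stored spans back; restored text is not scanned again."""
    return PLACEHOLDER.sub(lambda m: stored[int(m.group(1))], page)

doc = (
    "```text\n$$\nnot math `x`\n$$\n```\n\n"
    "Some `inline` code.\n\n"
    "$$\nx = `r 1 + 1`\n$$\n"
)
stripped, stored = strip_protected(doc)
restored = restore_protected(stripped, stored)
```

Because every span is stored whole, including its delimiters, there is no restore order to get wrong: in this sketch `stored` holds exactly three entries (the code block, the inline code and the latex block) and `restored` equals `doc`.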

Thank you again for this amazing tool.

Sincerely,
~Gw

from obsidian-html.

Gewerd-Strauss commented on September 20, 2024

> However, in principle you could nest inline code into a codeblock the same way you do with a latex block. In that case, the above solution would fail for the same reason the master branch fails in this issue. So, you could reorder both RestoreCodeSections() and StripCodeSections() to

Welp. Guess my brain was dead, cuz that solution is utterly bogus. It would annihilate code blocks, as the first two ` of a fence would be consumed to create inline code, leaving a malformed codeblock. So the overall order should be:

  1. Strip codeblocks
  2. Strip codelines
  3. Strip latexblocks
  4. Restore latexblocks
  5. Restore codelines
  6. Restore codeblocks

._.
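
Both points can be checked with a small sketch, assuming a table-driven variant of the two methods: the table lists the section types in strip order (steps 1 to 3), and restoring walks the same table backwards (steps 4 to 6), so the two orders cannot drift apart. `SECTION_TYPES`, `strip_all` and `restore_all` are hypothetical names; only the regexes and placeholder format come from the code quoted earlier. Unlike the quoted RestoreCodeSections, this sketch does not append a newline after restored code blocks, so that the round trip is exact.

```python
import re

# (name, pattern, delimiter), listed in strip order. Names are hypothetical.
SECTION_TYPES = [
    ("codeblock", re.compile(r"^```([\s\S]*?)```[\s]*?$", re.MULTILINE), "```"),
    ("codeline", re.compile(r"`(.*?)`"), "`"),
    ("latexblock", re.compile(r"^\$\$([\s\S]*?)\$\$[\s]*?$", re.MULTILINE), "$$"),
]

def strip_all(page, types=SECTION_TYPES):
    """Replace each section with a placeholder, one type at a time."""
    stored = []
    for name, pattern, delim in types:
        values = pattern.findall(page)
        for i, value in enumerate(values):
            page = page.replace(delim + value + delim, f"%%%{name}-placeholder-{i}%%%")
        stored.append(values)
    return page, stored

def restore_all(page, stored, types=SECTION_TYPES):
    """Undo strip_all by walking the same table in reverse."""
    for (name, _, delim), values in reversed(list(zip(types, stored))):
        for i, value in enumerate(values):
            page = page.replace(f"%%%{name}-placeholder-{i}%%%", delim + value + delim)
    return page

doc = (
    "```python\ns = `not inline`\n```\n\n"
    "Some `inline` code.\n\n"
    "$$\nx = `r 1 + 1`\n$$\n"
)

# Steps 1-6: strip in table order, restore in reverse.
stripped, stored = strip_all(doc)
roundtrip = restore_all(stripped, stored)

# Stripping inline code first instead: the first two backticks of each fence
# are consumed as an empty inline span, so no code block is found afterwards.
inline_first = [SECTION_TYPES[1], SECTION_TYPES[0], SECTION_TYPES[2]]
mangled, _ = strip_all(doc, inline_first)
```

Here `roundtrip` equals `doc`, whereas `mangled` starts with a codeline placeholder followed by a stray backtick and never receives a codeblock placeholder.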

dwrolvink commented on September 20, 2024

Did not test this as I am a bit short on time lately, but I agree with your analysis.
And if it works for you then I guess it's tested enough :)

Thanks for working this out and the PR!

Gewerd-Strauss commented on September 20, 2024

I can say at least that for me, running on that branch since opening this issue and PR I have not encountered an issue so far. Obviously I will be biased because I wrote the code pertaining to this; but so far I can not attest to any issues.
