markdown-pdf's Introduction

Module markdown-pdf

The free, open-source Python module markdown-pdf creates a PDF file from content in Markdown format.

When creating a PDF file you can:

  • Use UTF-8 encoded text in markdown in any language
  • Embed images used in markdown
  • Break text into pages in the desired order
  • Create a table of contents (bookmarks) from markdown headings
  • Tune the necessary elements using your CSS code
  • Use different page sizes within a single PDF
  • Create tables in markdown

The module utilizes the functions of two great libraries.

Installation

pip install markdown-pdf

Usage

Create a pdf with TOC (bookmarks) from headings up to level 2.

from markdown_pdf import MarkdownPdf

pdf = MarkdownPdf(toc_level=2)

Add the first section to the pdf. The title is not included in the table of contents.

from markdown_pdf import Section

pdf.add_section(Section("# Title\n", toc=False))

Add a second section. In the pdf file it starts on a new page. The title is centered using CSS, included in the table of contents of the pdf file, and an image from the file img/python.png is embedded on the page.

pdf.add_section(
  Section("# Head1\n\n![python](img/python.png)\n\nbody\n"),
  user_css="h1 {text-align:center;}"
)

Add a third section. Two headings of different levels from this section are included in the TOC of the pdf file. The section uses A4 pages in landscape orientation.

pdf.add_section(Section("## Head2\n\n### Head3\n\n", paper_size="A4-L"))

Add a fourth section with a table.

text = """# Section with Table

|TableHeader1|TableHeader2|
|--|--|
|Text1|Text2|
|ListCell|<ul><li>FirstBullet</li><li>SecondBullet</li></ul>|
"""

pdf.add_section(Section(text))

Set the properties of the pdf document.

pdf.meta["title"] = "User Guide"
pdf.meta["author"] = "Vitaly Bogomolov"

Save to file.

pdf.save("guide.pdf")

Settings and options

The Section class defines a portion of markdown data, which is processed according to the same rules. The next Section data starts on a new page.

The Section class accepts the following attributes.

  • toc: whether to include the headers <h1> - <h6> of this section in the TOC. Default is True.
  • root: the name of the root directory from which the image file paths in markdown start. Default ".".
  • paper_size: name of paper size, as described here. Default "A4".
  • borders: size of borders. Default (36, 36, -36, -36).
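
To make the default borders value concrete, here is a minimal sketch of how (36, 36, -36, -36) could translate into a content area. It assumes, and this is not confirmed by the text above, that each value is added to the matching coordinate of the page rectangle (left, top, right, bottom) and that units are PDF points (72 per inch, so 36 points is half an inch). The names A4_PORTRAIT and content_rect are hypothetical and not part of the module.

```python
# Sketch only: assumes each border value is added to the matching
# coordinate of the page rectangle (x0, y0, x1, y1), in PDF points.
A4_PORTRAIT = (0, 0, 595, 842)  # A4 in points, rounded


def content_rect(page_rect, borders=(36, 36, -36, -36)):
    """Return the area left for content after applying the borders."""
    return tuple(edge + offset for edge, offset in zip(page_rect, borders))


print(content_rect(A4_PORTRAIT))  # (36, 36, 559, 806)
```

Under this reading, positive values push the left and top edges inward and negative values pull the right and bottom edges inward.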

The following document properties are available for assignment (dictionary MarkdownPdf.meta) with the default values indicated.

  • creationDate: current date
  • modDate: current date
  • creator: "PyMuPDF library: https://pypi.org/project/PyMuPDF"
  • producer: ""
  • title: ""
  • author: ""
  • subject: ""
  • keywords: ""
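
When several properties are set at once, a small helper can guard against misspelled keys. The sketch below is an illustration only: set_meta and KNOWN_META_KEYS are hypothetical names, and whether the module itself rejects unknown keys is not stated above. A stand-in object is used so the example runs without the library installed.

```python
from types import SimpleNamespace

# Keys listed above as assignable document properties.
KNOWN_META_KEYS = {
    "creationDate", "modDate", "creator", "producer",
    "title", "author", "subject", "keywords",
}


def set_meta(pdf, **props):
    """Copy props into pdf.meta, rejecting keys not in the documented list."""
    unknown = set(props) - KNOWN_META_KEYS
    if unknown:
        raise KeyError(f"unknown document properties: {sorted(unknown)}")
    pdf.meta.update(props)
    return pdf


# Stand-in object so the sketch runs without the library installed;
# with markdown-pdf available, pass a MarkdownPdf instance instead.
fake_pdf = SimpleNamespace(meta={})
set_meta(fake_pdf, title="User Guide", author="Vitaly Bogomolov")
print(fake_pdf.meta)
```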

Example

As an example, you can download the pdf file created from this md file. This Python script was used to create the PDF file.

markdown-pdf's Issues

TypeError: '>' not supported between instances of 'str' and 'int'

Version: markdown_pdf 1.2 (pip install markdown-pdf, with Python 3.12.3 on AMD64)

Error while trying to save a PDF:

Traceback (most recent call last):
  File "app.py", line 64, in <module>
    pdf.save("result.pdf")
  File "/home/user/app/.venv/lib/python3.12/site-packages/markdown_pdf/__init__.py", line 82, in save
    if self.toc_level > 0:
       ^^^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'str' and 'int'

Steps to reproduce:

app.py:

from markdown_pdf import MarkdownPdf

[OTHER CODE]

pdf = MarkdownPdf(result)
pdf.meta["title"] = "Title"
pdf.meta["author"] = "Author"
pdf.save("result.pdf")

$ python app.py
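
A possible reading of the traceback: the comparison self.toc_level > 0 fails because toc_level holds a string, which would happen if the first positional argument of MarkdownPdf is the TOC level and result contains markdown text. That is an assumption, since the report does not show what result is. The sketch below uses a hypothetical helper, checked_toc_level, to fail early with a clearer message. The commented lines follow the Usage section of the README.

```python
def checked_toc_level(value):
    """Return value if it is a usable TOC depth, else raise a clear error."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(
            f"toc_level must be an int, got {type(value).__name__}; "
            "pass markdown text through Section(...) instead"
        )
    if value < 0:
        raise ValueError("toc_level must be >= 0")
    return value


# The comparison from the traceback fails the same way for any str:
try:
    "# Title" > 0
except TypeError as exc:
    print(exc)

# Intended usage, following the Usage section (needs markdown-pdf installed):
#   pdf = MarkdownPdf(toc_level=checked_toc_level(2))
#   pdf.add_section(Section(result))
```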

Problems with python3.12

Hi! I recently upgraded to Python 3.12 and tried to install markdown-pdf the usual way with pip (I'm on macOS, by the way).

Unfortunately the installation is stuck for quite some time on this step:

user@device ~ % pip3 install markdown-pdf
Collecting markdown-pdf
  Using cached markdown_pdf-1.1-py3-none-any.whl.metadata (3.3 kB)
Collecting PyMuPDF==1.23.3 (from markdown-pdf)
  Using cached PyMuPDF-1.23.3.tar.gz (60.5 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... \

After 5 minutes or so, the installation fails with a huge error log that exceeds my console. The last few lines look like this:

                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/homebrew/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 152, in prepare_metadata_for_build_wheel
          whl_basename = backend.build_wheel(metadata_directory, config_settings)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/3k/4k2883bs10n32h6f0ybvcmt80000gp/T/pip-install-jrpjs54o/pymupdf_b8db2ebb1db84de899638ee47b15d7ee/pipcl.py", line 580, in build_wheel
          items = self._call_fn_build(config_settings)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/3k/4k2883bs10n32h6f0ybvcmt80000gp/T/pip-install-jrpjs54o/pymupdf_b8db2ebb1db84de899638ee47b15d7ee/pipcl.py", line 732, in _call_fn_build
          ret = self.fn_build()
                ^^^^^^^^^^^^^^^
        File "/private/var/folders/3k/4k2883bs10n32h6f0ybvcmt80000gp/T/pip-install-jrpjs54o/pymupdf_b8db2ebb1db84de899638ee47b15d7ee/setup.py", line 692, in build
          mupdf_build_dir = build_mupdf_unix( mupdf_local, env_extra, build_type)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/3k/4k2883bs10n32h6f0ybvcmt80000gp/T/pip-install-jrpjs54o/pymupdf_b8db2ebb1db84de899638ee47b15d7ee/setup.py", line 928, in build_mupdf_unix
          subprocess.run( command, shell=True, check=True)
        File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/subprocess.py", line 571, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command 'cd /private/var/folders/3k/4k2883bs10n32h6f0ybvcmt80000gp/T/pip-install-jrpjs54o/pymupdf_b8db2ebb1db84de899638ee47b15d7ee/mupdf-1.23.2-source && XCFLAGS=-DTOFU_CJK_EXT /opt/homebrew/Cellar/python@3.12/3.12.3/bin/python3.12 ./scripts/mupdfwrap.py -d build/PyMuPDF-arm64-shared-tesseract-release -b all && echo /private/var/folders/3k/4k2883bs10n32h6f0ybvcmt80000gp/T/pip-install-jrpjs54o/pymupdf_b8db2ebb1db84de899638ee47b15d7ee/mupdf-1.23.2-source/build/PyMuPDF-arm64-shared-tesseract-release: && ls -l /private/var/folders/3k/4k2883bs10n32h6f0ybvcmt80000gp/T/pip-install-jrpjs54o/pymupdf_b8db2ebb1db84de899638ee47b15d7ee/mupdf-1.23.2-source/build/PyMuPDF-arm64-shared-tesseract-release' returned non-zero exit status 1.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

The problem does not occur when using other pip packages. I'm happy to provide more logs if needed :)

Adding a header with only a space causes a segmentation fault

Thanks for this very helpful package!

I just noticed a very strange bug occurring due to PyMuPDF.

The following sample code causes a segmentation fault:

from markdown_pdf import MarkdownPdf, Section
pdf = MarkdownPdf(toc_level=2)
pdf.add_section(Section("# "))

Result with gdb:

Starting program: ./venv/bin/python test.py

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Downloading separate debug info for ./venv/lib/python3.11/site-packages/fitz/_extra.cpython-311-x86_64-linux-gnu.so
Downloading separate debug info for ./venv/lib/python3.11/site-packages/fitz/libmupdf.so.24.1
Downloading separate debug info for ./venv/lib/python3.11/site-packages/fitz/libmupdfcpp.so.24.1
Downloading separate debug info for ./venv/lib/python3.11/site-packages/fitz/_mupdf.so

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff56e5ad2 in ?? () from ./venv/lib/python3.11/site-packages/fitz/libmupdf.so.24.1

The bug appears to be in the fitz package but I'm unsure where to file it.

Cheers
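
Until the underlying crash is resolved, one possible workaround is to filter out heading lines that have no text before the markdown reaches Section. The sketch below is a hedged illustration: drop_empty_headings is a hypothetical helper, it does not fix the crash itself, and it assumes that an empty heading is what triggers it. It works line by line, so it would also remove a bare "#" line inside a fenced code block.

```python
import re

# An ATX heading line that has '#' markers but no text after them.
EMPTY_HEADING = re.compile(r"^[ ]{0,3}#{1,6}[ \t]*$")


def drop_empty_headings(markdown_text):
    """Remove heading lines that carry no title text."""
    kept = [
        line for line in markdown_text.splitlines()
        if not EMPTY_HEADING.match(line)
    ]
    return "\n".join(kept)


print(repr(drop_empty_headings("# \n\n## Real heading\n\nbody")))
```

The cleaned text can then be passed to Section in place of the raw input.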

Custom CSS

Hi, sorry if I'm missing the obvious here. How would I feed in custom CSS (I really just want to centre images)? Cheers for any help.
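
The Usage section of the README passes CSS through the user_css argument of add_section, for example "h1 {text-align:center;}". The sketch below builds such a string from a dictionary. css_from_rules is a hypothetical helper, and whether a given rule has the desired effect on images depends on the renderer's CSS support, which is not confirmed here.

```python
def css_from_rules(rules):
    """Render {selector: {property: value}} as a compact CSS string."""
    return " ".join(
        selector + " {" + "".join(f"{prop}:{val};" for prop, val in decls.items()) + "}"
        for selector, decls in rules.items()
    )


css = css_from_rules({"h1": {"text-align": "center"}})
print(css)  # h1 {text-align:center;}

# With markdown-pdf installed, the string is passed as in the Usage section:
#   pdf.add_section(Section("# Head1\n\nbody\n"), user_css=css)
```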

Support highlight?

When I output the entire markdown content as a PDF, code blocks are not highlighted and the markdown is incomplete: a portion on the right side is cut off.

from markdown_pdf import MarkdownPdf
from markdown_pdf import Section


def markdown_pdf_write(file_fullname, file_contents):
    pdf = MarkdownPdf(toc_level=2)
    pdf.add_section(Section(file_contents, toc=False))

    pdf.save(file_fullname)

Tables support

Markdown:

#header1

|TableHeader1|TableHeader2|
|--|--|
|Text1|Details 1|
|ListCell|<ul><li>FirstBullet</li><li>SecondBullet</li></ul>|

Rendered as:

[image]

Expected:

[image]

Images in markdown do not get pulled in

I am using markdown-pdf to pull in several existing markdown files with embedded images and write them to a single pdf. The separate markdowns display the images (with either a relative or absolute path) correctly. But, when I read them into the library with pdf.add_section the markdown comes in fine and converts to a pdf file but the image is not included.
Code:

from markdown_pdf import MarkdownPdf
from markdown_pdf import Section

# create pdf
pdf = MarkdownPdf(toc_level=2)

# add section
pdf.add_section(Section("# Catchment ID 44193\n"))

# add 2nd section from markdown file
md = open('./markdown/Intro.md', 'r', newline='', encoding='utf-8-sig').read()
pdf.add_section(Section(md))

# set pdf properties
pdf.meta["title"] = "LOCA Report"
pdf.meta["author"] = "Tyson Broad"

# save pdf
pdf.save("./src/python/md2pdf_test3.pdf")

Image of the local markdown displaying image correctly:
image

Problematic markdown text attached.
Intro.md
Output PDF attached.
md2pdf_test3.pdf
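
One thing worth checking, based on the README's description of the Section root attribute (the directory from which image paths in the markdown start, default "."): if the images are referenced relative to the markdown file, the default root may point to the wrong directory. Whether that is the cause in this report is an assumption. The sketch below uses a hypothetical helper, read_markdown_with_root, that returns the file's text together with its parent directory.

```python
from pathlib import Path
import tempfile


def read_markdown_with_root(md_path):
    """Return (text, root): the file's text and the directory it lives in."""
    path = Path(md_path)
    text = path.read_text(encoding="utf-8-sig")  # same encoding as the report
    return text, str(path.parent)


# Self-contained demonstration with a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    md_file = Path(tmp) / "Intro.md"
    md_file.write_text("![fig](img/fig.png)\n", encoding="utf-8")
    text, root = read_markdown_with_root(md_file)
    print(root)

# With markdown-pdf installed:
#   pdf.add_section(Section(text, root=root))
```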
