percollate

Percollate is a command-line tool to turn web pages into beautifully formatted PDFs. See How it works.

Example Output

Example spread from the generated PDF of a chapter in Dimensions of Colour; rendered here in black & white for a smaller image file size.

Installation

💡 percollate needs Node.js version 8 or later, as it uses new(ish) JavaScript syntax.
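
To confirm which Node.js version is on your PATH before installing, something like the following may help. It is a minimal sketch and not part of percollate: `node_version_ok` is a hypothetical helper name, and the snippet assumes `node --version` prints a string such as `v8.11.1`.

```shell
# Hypothetical helper (not part of percollate): succeeds when the major
# version in a string such as "v8.11.1" is 8 or later.
node_version_ok() {
  major="${1#v}"        # drop the leading "v", if any
  major="${major%%.*}"  # keep only the part before the first dot
  [ "$major" -ge 8 ]
}

if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node --version)"; then
    echo "Node.js $(node --version) should be recent enough"
  else
    echo "Node.js $(node --version) is too old; percollate needs version 8 or later"
  fi
else
  echo "Node.js was not found on PATH"
fi
```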

You can install percollate globally:

# using npm
npm install -g percollate

# using yarn
yarn global add percollate

To keep the package up-to-date, you can run:

# using npm, upgrading is the same command as installing
npm install -g percollate

# yarn has a separate command
yarn global upgrade --latest percollate

Usage

💡 Run percollate --help for a list of available commands. For a particular command, percollate <command> --help lists all available options.

Available commands

Command | What it does
--- | ---
percollate pdf | Bundles one or more web pages into a PDF
percollate epub | Not implemented yet
percollate html | Not implemented yet

Available options

The pdf, epub, and html commands have these options:

Option | What it does
--- | ---
-o, --output | The path of the resulting bundle; when omitted, the output file name is derived from the title of the web page.
--individual | Export each web page as an individual file.
--template | Path to a custom HTML template.
--style | Path to a custom CSS stylesheet.
--css | Additional CSS styles passed on the command line to override the default or custom stylesheet.

Examples

Basic PDF generation

To transform a single web page to PDF:

percollate pdf --output some.pdf https://example.com

To bundle several web pages into a single PDF, specify them as separate arguments to the command:

percollate pdf --output some.pdf https://example.com/page1 https://example.com/page2

You can use common Unix commands and keep the list of URLs in a newline-delimited text file:

cat urls.txt | xargs percollate pdf --output some.pdf
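
If the URL list also carries blank lines or notes, one option is to filter them out before handing the URLs to xargs. The following is a sketch under stated assumptions: the `#` comment convention, the `clean_urls` helper, and the sample file contents are illustrative and not something percollate defines.

```shell
# Create a sample list in a temporary file. The "#" comment convention is an
# assumption of this sketch, not a percollate feature.
list="$(mktemp)"
cat > "$list" <<'EOF'
# chapters to bundle
https://example.com/page1

https://example.com/page2
EOF

# Hypothetical helper: print only lines that are neither blank nor comments.
clean_urls() {
  grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1"
}

# Preview the command that would run; drop "echo" to actually invoke percollate.
clean_urls "$list" | xargs echo percollate pdf --output some.pdf
```

Keeping the `echo` in place while editing the list makes it easy to check which URLs will be passed along before a PDF is generated.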

To transform several web pages into individual PDF files at once, use the --individual flag:

percollate pdf --individual --output some.pdf https://example.com/page1 https://example.com/page2

Custom page size / margins

The default page size is A5 (portrait). You can use the --css option to override it using any supported CSS size:

percollate pdf --output some.pdf --css "@page { size: A3 landscape }" http://example.com

Similarly, you can define:

  • custom margins: @page { margin: 0 }
  • the base font size: html { font-size: 10pt }

or, for that matter, any other style defined in the default / custom stylesheet.
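
Because --css takes a string of CSS, several of the overrides above can plausibly travel in a single value. The sketch below assumes the rules are applied as ordinary CSS when concatenated; the variable name `css` is only illustrative.

```shell
# Collect the page size, margin, and base font size overrides in one string.
css='@page { size: A3 landscape; margin: 0 } html { font-size: 10pt }'

# Preview the full command; to run it for real, call percollate directly
# with --css "$css" once it is installed.
printf 'percollate pdf --output some.pdf --css "%s" http://example.com\n' "$css"
```

Holding the CSS in a single-quoted shell variable keeps the braces and semicolons away from the shell's own parsing.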

Using a custom HTML template

โš ๏ธ TODO add example here

Using a custom CSS stylesheet

โš ๏ธ TODO add example here

Customizing the page header / footer

โš ๏ธ TODO add example here

How it works

  1. Fetch the page(s) using got
  2. Enhance the DOM using jsdom
  3. Pass the DOM through mozilla/readability to strip unnecessary elements
  4. Apply the HTML template and the print stylesheet to the resulting HTML
  5. Use puppeteer to generate a PDF from the page

Troubleshooting

On some Linux machines, you'll need to install a few more Chrome dependencies before percollate works correctly. (Thanks to @ptica for sorting it out.)

The percollate pdf command supports the --no-sandbox Puppeteer flag, but make sure you're aware of the implications before disabling the sandbox.

Contributing

Contributions of all kinds are welcome! See CONTRIBUTING.md for details.

See also

Here are some other projects to check out if you're interested in building books using the browser:
