sPDF

sPDF (pronounced speedy-f) is a Scala library that makes it easy to create complex PDFs from plain HTML, CSS and JavaScript.

On the backend it uses wkhtmltopdf, which renders HTML using WebKit.

sPDF is heavily inspired by Ruby's PdfKit gem.

The main features of sPDF are:

  • full support of wkhtmltopdf extended parameters (see the source of the PdfConfig trait)
  • can read HTML from several sources: java.io.File, java.io.InputStream, java.net.URL, scala.xml.Elem, and String
  • can write PDFs to File and OutputStream

The source HTML can reference images and stylesheets as long as the URLs point to the absolute path of the source file. It's also possible to embed JavaScript in the pages: wkhtmltopdf waits for the document ready event before generating the PDF.

Installation

Add the following to your sbt build for Scala 2.10, 2.11 and 2.12:

libraryDependencies += "io.github.cloudify" %% "spdf" % "1.4.0"

Add the following to your sbt build for Scala 2.9:

libraryDependencies += "io.github.cloudify" %% "spdf" % "1.3.1"

Usage

	import io.github.cloudify.scala.spdf._
	import java.io._
	import java.net._

	// Create a new Pdf converter with a custom configuration
	// run `wkhtmltopdf --extended-help` for a full list of options
	val pdf = Pdf(new PdfConfig {
	  orientation := Landscape
	  pageSize := "Letter"
	  marginTop := "1in"
	  marginBottom := "1in"
	  marginLeft := "1in"
	  marginRight := "1in"
	})

	val page = <html><body><h1>Hello World</h1></body></html>

	// Save the PDF generated from the above HTML into a Byte Array
	val outputStream = new ByteArrayOutputStream
	pdf.run(page, outputStream)

	// Save the PDF of Google's homepage into a file
	pdf.run(new URL("http://www.google.com"), new File("google.pdf"))

To use sPDF in headless mode on Debian, you need to call wkhtmltopdf through a virtual display wrapper such as xvfb-run, because the wkhtmltopdf build shipped in the apt package does not support running headless. In this kind of environment, use WrappedPdf instead of Pdf. For example:

	import io.github.cloudify.scala.spdf._
	import java.io._
	import java.net._

	// Create a new Pdf converter with a custom configuration
	// run `wkhtmltopdf --extended-help` for a full list of options
	val pdf = WrappedPdf(Seq("xvfb-run", "wkhtmltopdf"), new PdfConfig {
	  orientation := Landscape
	  pageSize := "Letter"
	  marginTop := "1in"
	  marginBottom := "1in"
	  marginLeft := "1in"
	  marginRight := "1in"
	})

	val page = <html><body><h1>Hello World</h1></body></html>

	// Save the PDF generated from the above HTML into a Byte Array
	val outputStream = new ByteArrayOutputStream
	pdf.run(page, outputStream)

	// Save the PDF of Google's homepage into a file
	pdf.run(new URL("http://www.google.com"), new File("google.pdf"))

Installing wkhtmltopdf

Visit the wkhtmltopdf downloads page and install the appropriate package for your platform.

Troubleshooting

NoExecutableException

Make sure wkhtmltopdf is installed and your JVM is running with the correct PATH environment variable.

If that doesn't work you can manually set the path to wkhtmltopdf when you create a new Pdf instance:

val pdf = Pdf("/opt/bin/wkhtmltopdf", PdfConfig.default)
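To check which PATH the JVM actually sees (it can differ from the one in your login shell), a small standalone helper can help. This is only a diagnostic sketch, not sPDF's own lookup logic; the object name PathCheck and the function findOnPath are made up for illustration.

```scala
import java.io.File

object PathCheck {
  // Returns every executable file called `name` found in the directories of `path`.
  // `path` defaults to the PATH variable as seen by the running JVM.
  def findOnPath(name: String, path: String = sys.env.getOrElse("PATH", "")): Seq[File] =
    path.split(File.pathSeparator).toSeq
      .filter(_.nonEmpty)
      .map(dir => new File(dir, name))
      .filter(f => f.isFile && f.canExecute)
}
```

Calling `PathCheck.findOnPath("wkhtmltopdf")` from inside the application lists the candidates the JVM could launch. If the list is empty, pass the full path explicitly as shown above.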

Resources aren't included in the PDF

If images, CSS, or JavaScript are missing from the PDF, wkhtmltopdf most likely does not know where to find those files. Make sure you reference your resources with absolute paths (starting with a forward slash). If you are generating PDFs from a raw HTML source, make sure you use complete paths (either file paths or URLs including the domain).
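One way to avoid relative references is to resolve every local resource to an absolute file URI before building the HTML. The sketch below uses only the JDK; the object name AbsoluteResources and the parameters cssPath and imagePath are hypothetical, and it assumes the paths need no HTML escaping.

```scala
import java.nio.file.Paths

object AbsoluteResources {
  // Turns a (possibly relative) local path into an absolute file URI string.
  def fileUri(path: String): String =
    Paths.get(path).toAbsolutePath.toUri.toString

  // Builds a page whose stylesheet and image are referenced by absolute URI.
  def page(cssPath: String, imagePath: String): String =
    s"""<html>
       |  <head><link rel="stylesheet" href="${fileUri(cssPath)}"/></head>
       |  <body><img src="${fileUri(imagePath)}"/></body>
       |</html>""".stripMargin
}
```

The resulting string can be passed to `pdf.run` like any other HTML source.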

Notes

Asynchronous conversion

sPDF relies on Scala's scala.sys.process.Process class to execute wkhtmltopdf and pipe input/output data.

The execution of wkhtmltopdf and thus the conversion to PDF is blocking. If you need the processing to be asynchronous you can wrap the call inside a Future.

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

val pdf = Pdf(PdfConfig.default)

val result = Future { pdf.run(new URL("http://www.google.com"), new File("google.pdf")) }
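Because the conversion blocks a thread for as long as wkhtmltopdf runs, it may be worth marking the call as blocking so the default execution context can compensate. The following is a minimal sketch under that assumption; AsyncConversion and convertAsync are illustrative names, not part of sPDF, and the by-name parameter stands in for any call such as `pdf.run(source, destination)`.

```scala
import scala.concurrent.{Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global

object AsyncConversion {
  // Evaluates `convert` on the global execution context, flagged as blocking
  // so the pool may add a thread while the conversion is waiting.
  def convertAsync(convert: => Int): Future[Int] =
    Future { blocking { convert } }
}
```

A caller would write something like `AsyncConversion.convertAsync(pdf.run(page, new File("out.pdf")))` and react to the exit code when the future completes.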

Contributing

  • Fork the project.
  • Make your feature addition or bug fix.
  • Add tests for it. This is important so I don't break it in a future version unintentionally.
  • Commit, do not mess with build settings, version, or history.
  • Send me a pull request. Bonus points for topic branches.

Release / Publish

  • release cross with-defaults
  • check out released version
  • publishSigned
  • sonatypeRelease

Roadmap

  • Full support for extended options
  • Full support for input types
  • Streaming API (with scalaz-stream)
  • Simplified API with implicits
  • Integration with Play for streaming PDFs in HTTP responses

Copyright

Copyright (c) 2013, 2014 Federico Feroldi. See LICENSE for details.

spdf's People

Contributors

andrewmee, ashkulz, cloudify, cvrabie, jamarisi, josephpconley, jsnrth, lustefaniak, masgo, notbobthebuilder, olkinn, wvandrunen

spdf's Issues

PdfConfig findExecutable does not find the installed wkhtmltopdf

Hello,

I use "yum install" and the package wkhtmltopdf is installed in /usr/local/bin.
A manual "which wkhtmltopdf" gives me /usr/local/bin/wkhtmltopdf where PdfConfig findExecutable seems (if I understand the code) to find nothing.
Consequently I cannot render/create pdf.
Do you have an idea about the problem and how I can solve it?

I don't know if the next information is relevant but I would like to add that the error states:
No wkhtmltopdf executable found at /sbin:/bin:/usr/sbin:/usr/bin (only those 4)
whereas an echo $PATH gives
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin: ...

Thanks

Gerald

Any plans to integrate with wkhtmltopdf without requiring a binary installed on the system?

Are there any plans to integrate with wkhtmltopdf as a renderer without requiring the user to have wkhtmltopdf installed on their system?

I understand that wkhtmltopdf isn't a native java library, but it would make my life (and probably other developers' too) much easier if our applications didn't need to make (slow) system calls out to a binary in order to use this library.

Support for Scala 2.11

It would be great if this could be updated to support Scala 2.11. I'll fork the project and see where I can get.

Strange offset accumulation

So I solved my problems by using for each page a div with the same size as the page.

My problem now is that some padding or margin still seems to be added between the divs on every page, so that after 20 pages the page content is offset by a few pixels. I have set all CSS margins and paddings of the page divs to zero.
I also set all margins and the header and footer spacings of the PDF configuration to zero.

Any idea somebody?

Thanks and best

Support PDFA

I need to generate PDF/A. Am I correct in thinking sPDF doesn't support this format? If so, is it possible to add this as an enhancement?

Defensive API: Killing Processes and Future/ Watchdog Implementation.

I'm writing a Java OSGi wrapper around sPDF and I'd like to guard against denial of service and bugs in wkhtmltopdf. I think there should be some defensive elements in the API.

The Pdf class could do with a run method that returns a handle to the Process object, so that a watchdog can kill the process.

Additionally Future and watchdog implementations would be handy.
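The kind of watchdog described here can be sketched with scala.sys.process alone. This bypasses sPDF and launches a command directly, so it only illustrates the idea; Watchdog and runWithTimeout are hypothetical names.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.FiniteDuration
import scala.sys.process.Process

object Watchdog {
  // Runs `command`; returns Some(exitCode), or None if it was killed after `timeout`.
  def runWithTimeout(command: Seq[String], timeout: FiniteDuration): Option[Int] = {
    val process = Process(command).run()               // starts without blocking
    val exit = Future(blocking(process.exitValue()))   // waits for the process to end
    try Some(Await.result(exit, timeout))
    catch {
      case _: TimeoutException =>
        process.destroy()                              // kill the runaway process
        None
    }
  }
}
```

For example, `Watchdog.runWithTimeout(Seq("wkhtmltopdf", "in.html", "out.pdf"), 30.seconds)` (with `scala.concurrent.duration._` imported) would give up and kill the process after thirty seconds.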

[RuntimeException: no exit code: process destroyed.] on pdf.run command

On Windows 8, I try to run below code:

val pdf = Pdf(new PdfConfig {
    marginTop := "0.3in"
    marginBottom := "0.3in"
    marginLeft := "0in"
    marginRight := "0in"
})      
val page: String = "<div class='printpage'>"+userData.html+"</div>"
val baseURL: String = Play.application.path+"/EmailHistory/"+username+"_email_temp.pdf"
val f = new FileOutputStream(baseURL)
// Save the PDF generated from the above HTML into a Byte Array
val outputStream = new ByteArrayOutputStream
pdf.run(page, outputStream)
outputStream.writeTo(f)

That gives me [RuntimeException: no exit code: process destroyed.] on the pdf.run(page, outputStream) command. Is there any way to fix that?

Note: I don't have any problem running that code on my Mac.

Configure DPI

I'm trying to configure the DPI option but can't find any option to do so.
Is there a way to add your own parameters by overriding something?
Currently PdfConfig.toParameters is hardcoded so I can't change what parameters are used.
Maybe adding a val custom: Seq[String] = Seq() that can be overridden in the PdfConfig?

run() generates the PDF but does not return

In my setup (Akka, OSX) the run method generates the PDF successfully but never returns. Does anybody have an idea what this could be about?

  val pdf = Pdf(new PdfConfig {
    orientation := Landscape
    pageSize := "Letter"
    marginTop := "1in"
    marginBottom := "1in"
    marginLeft := "1in"
    marginRight := "1in"
  })

  val page = <html><body><h1>Hello World</h1><img src="/tmp/SDA879723DASD/image0.jpg" height="300"/><img src="/tmp/SDA879723DASD/image1.jpg" height="300"/><img src="/tmp/SDA879723DASD/image2.jpg" height="300"/></body></html>
  pdf.run(page, new File(task.tempDirPath+"/PrintJob.pdf"))

  println("PDF generation is done")

So the print statement is never executed. But the PDF generation is complete, I can open it and the file size stays constant / does not grow further.

Support for multiple HTML files to single PDF

Hi @cloudify,

I see that the wkhtmltopdf supports multiple HTML files as input

$ wkhtmltopdf --extended-help
Name:
  wkhtmltopdf 0.9.9

Synopsis:
  wkhtmltopdf [OPTIONS]... <input file> [More input files] <output file>

Description:
  Converts one or more HTML pages into a PDF document, *not* using wkhtmltopdf
  patched qt.

So basically I can pass a list of different HTML inputs to be rendered in one PDF.
I was wondering if this is also supported in this library but I suppose it's not (I haven't found it anyway).

So what I would do is basically something like

val pdf = Pdf(PdfConfig.default)
val os = new ByteArrayOutputStream
pdf.run(Seq(template1, template2, template3), os)

to be able to generate the correct command line.

What do you think? Is that possible?
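For reference, the command line implied by the synopsis above (options, then every input, then the output) could be assembled like this. It is a standalone sketch, not an sPDF API; MultiInput and command are made-up names.

```scala
object MultiInput {
  // Builds a wkhtmltopdf argument list: options, then all inputs, then the output file.
  def command(inputs: Seq[String], output: String, options: Seq[String] = Nil): Seq[String] = {
    require(inputs.nonEmpty, "at least one input is required")
    Seq("wkhtmltopdf") ++ options ++ inputs ++ Seq(output)
  }
}
```

The resulting sequence can be handed to `scala.sys.process.Process(...)` to run the conversion outside of sPDF.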

Thanks

WKhtmltopdf not able to run as headless in Debian OS

Hi
I was not able to run sPDF on Ubuntu 18.04, as wkhtmltopdf is unable to run headless on Debian-based systems. The suggested approach was to use xvfb.
Could sPDF be improved to allow it to run with xvfb? Below is a sample of the command:
xvfb-run -- /usr/bin/wkhtmltopdf --lowquality http://www.google.com google.pdf

I forked the project. I couldn't find a good place to extend the API, so I hardcoded "-- /usr/bin/wkhtmltopdf" as the argument, with "xvfb-run" as the executable path.

Any experience with css metrics?

Hi again,

does anybody have experience when it comes to css metrics in the html?

I am just trying to center a div using cm as the unit, but it won't work. I have a page size of 20x20cm as set by

    pageWidth := "200"
    pageHeight := "200"

and position the div with

but it is not in the middle. I tried already setting the page size for print media with

<style type='text/css'>@media print {body {margin: 0;}@page{size: 20cm 20cm; margin:0cm;}}</style>

but it did not help.

After searching a lot it seems that this is a general problem of wkhtmltopdf. I am just wondering if anybody here knows a solution to this?

Thanks
Ole

Tests time out some times

The test that hangs is OutputStreamDestinationDocument / "pipe process STDOUT into destination stream"

wkhtmltopdf --print-media-type flag

Is there support for the --print-media-type flag?

      --print-media-type              Use print media-type instead of screen

I didn't see it in the list of PdfConfig parameters.

-j

PDF generation fails silently

This library works on my mac just fine.

But on Ubuntu 14.04, I have it installed like this:

$ which wkhtmltopdf
/usr/bin/wkhtmltopdf

$ wkhtmltopdf -V
Name:
  wkhtmltopdf 0.9.9
...

I am using it like this:

val html: String = "<html> .... "
val pdf = Pdf("/usr/bin/wkhtmltopdf", PdfConfig.default)
val destination = File.createTempFile("html2pdf", ".pdf")
val exitCode = pdf.run(html, destination)
if (exitCode != 0) Logger.error(s"PDF generation task exited with non-zero exitCode=$exitCode")
if (destination.length() == 0) Logger.warn(s"Generated PDF is empty")

But, I see that exitCode is 1 and the generated PDF is empty.
What's wrong? Why is it silently failing? It does not print any other helpful debug info at all! How can I make it print debug info and underlying error?
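One way to see the underlying error is to run the same wkhtmltopdf command directly and capture its standard error. The sketch below uses scala.sys.process with a ProcessLogger; it does not go through sPDF, and Diagnose is a hypothetical helper name.

```scala
import scala.collection.mutable.ListBuffer
import scala.sys.process.{Process, ProcessLogger}

object Diagnose {
  // Runs `command` and returns (exit code, stdout lines, stderr lines).
  def run(command: Seq[String]): (Int, List[String], List[String]) = {
    val out = ListBuffer.empty[String]
    val err = ListBuffer.empty[String]
    val logger = ProcessLogger(line => out += line, line => err += line)
    val exit = Process(command).!(logger)   // blocks until the process exits
    (exit, out.toList, err.toList)
  }
}
```

Running `Diagnose.run(Seq("/usr/bin/wkhtmltopdf", "input.html", "output.pdf"))` and printing the third element of the result should show whatever wkhtmltopdf wrote to stderr before exiting with a non-zero code.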

Support for --allow option

Not seeing this as a member of the PdfConfig class. From wkhtmltopdf help:

 --allow <path>                  Allow the file or files from the specified
                                          folder to be loaded (repeatable)

Needed to load images, etc. Am I missing something or is this unsupported?

QXcbConnection: Could not connect to display

I'm using a GitLab runner for CI. When I run the tests to generate a PDF I run into the error:

QXcbConnection: Could not connect to display

This causes the output stream to be empty, that is to say, this test fails:

result.toByteArray.size must not be 0

Dockerfile:

FROM java:8

RUN apt-get update && apt-get -y install apt-transport-https \
    && echo "deb https://dl.bintray.com/sbt/debian /" | tee -a /etc/apt/sources.list.d/sbt.list \
    && apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823 \
    && apt-get update \
    && apt-get -y install sbt \
    # Cleanup
    && apt-get clean \
    && rm -rf /var/lib/apt/lists

Build job:

test:
  image: <private>/docker:sbt
  stage: test
  services:
  - mysql:5.6
  variables:
    MYSQL_DATABASE: "<private>"
    MYSQL_ROOT_PASSWORD: "<private>"
  script:
  - apt-get update
  - apt-get install -y --no-install-recommends wkhtmltopdf
  - sbt clean coverage test coverageReport
  artifacts:
    expire_in: 1d
    paths:
    - target/scala-2.11/scoverage-report/

Please let me know if you need more information.

Table of Content not working

The command line parameter for table of content is created as --toc.
According to the wkhtmltopdf documentation it should be created as toc.

Table Of Contents:
A table of contents can be added to the document by adding a toc object to the
command line. For example:

wkhtmltopdf toc https://doc.qt.io/archives/qt-4.8/qstring.html qstring.pdf
