Code Monkey home page Code Monkey logo

pdfcpu's Introduction

pdfcpu: a Go PDF processor

Build Status GoDoc Coverage Status Go Report Card Hex.pm Latest release

pdfcpu is a PDF processing library written in Go supporting encryption. It provides both an API and a CLI. Supported are all versions up to PDF 1.7 (ISO-32000).

Motivation

This is an effort to build a comprehensive PDF processing library from the ground up written in Go. Over time pdfcpu aims to support the standard range of PDF processing features and also any interesting use cases that may present themselves along the way.

     

     

 

Focus

The main focus lies on strong support for batch processing and scripting via a rich command line. At the same time pdfcpu wants to make it easy to integrate PDF processing into your Go based backend system by providing a robust command set.

Command Set

Documentation

  • The main entry point is pdfcpu.io.
  • For CLI examples also go to pdfcpu.io. There you will find explanations of all the commands and their parameters.
  • For API examples of all pdfcpu operations please refer to GoDoc.

GoDoc

Reminder

  • Always make sure your work is based on the latest commit!
  • pdfcpu is still Alpha - bugfixes are committed on the fly and will be mentioned in the next release notes.
  • Follow pdfcpu for news and release announcements.
  • For quick questions or discussions get in touch on the Gopher Slack in the #pdfcpu channel.

Demo Screencast

(using older version with a smaller command set)

asciicast

Installation

Download

Get the latest binary here.

Using GOPATH

Required go version for building: go1.13 and up

go get github.com/pdfcpu/pdfcpu/cmd/...
cd $GOPATH/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu
go install
pdfcpu version

Using Go Modules

git clone https://github.com/pdfcpu/pdfcpu
cd pdfcpu/cmd/pdfcpu
go install
pdfcpu ve

Note

We transferred this repo to the pdfcpu organisation. All links to the previous repository location are automatically redirected to the new location. However, to avoid confusion, we strongly recommend updating any existing local clones to point to the new repository URL. You can do this by using git remote on the command line:

git remote set-url origin https://github.com/pdfcpu/pdfcpu

Using Docker

docker build -t pdfcpu .
# mount current folder into container to process local files
docker run -it --mount type=bind,source="$(pwd)",target=/app pdfcpu ./pdfcpu validate -mode strict /app/pdfs/a.pdf

Contributing

What

  • Please open an issue if you find a bug or want to propose a change.
  • Feature requests - always welcome!
  • Bug fixes - always welcome!
  • PRs - anytime!
  • pdfcpu is stable but still Alpha and occasionally undergoing heavy changes.

How

  • If you want to report a bug please attach the very verbose (pdfcpu cmd -vv ...) output and ideally a test PDF that you can share.
  • Always make sure your contribution is based on the latest commit.
  • Please sign your commits.

Reporting Crashes

Unfortunately crashes do happen :( For the majority of the cases this is due to a diverse pool of PDF Writers out there and millions of PDF files using different versions waiting to be processed by pdfcpu. Sometimes these PDFs were written more than 20(!) years ago. Often there is an issue with validation - sometimes a bug in the parser. Many times even using relaxed validation with pdfcpu does not work. In these cases we need to extend relaxed validation and for this we are relying on your help. By reporting crashes you are helping to improve the stability of pdfcpu. If you happen to crash on any pdfcpu operation be it on the command line or in your Go backend these are the steps to report this:

Regardless of the pdfcpu operation, please start using the pdfcpu command line to validate your file:

pdfcpu validate -v &> crash.log

or to produce very verbose output

pdfcpu validate -vv &> crash.log

will produce what's needed to investigate a crash. Then open an issue and post crash.log or its contents. Ideally post a test PDF you can share to reproduce this. You can also email to [email protected] or if you prefer Slack you can get in touch on the Gopher slack #pdfcpu channel.

If processing your PDF with pdfcpu crashes during validation and can be opened by Adobe Reader and Mac Preview chances are we can extend relaxed validation and provide a fix. If the file in question cannot be opened by both Adobe Reader and Mac Preview we cannot help you!

Contributors

Thanks goes to these wonderful people:


Horst Rutter


haldyr


Vyacheslav


Erik Unger


Richard Wilkes


minenok-tutu


Mateusz Burniak


Dmitry Harnitski


ryarnyah


Sam Giffney

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Disclaimer

Usage of pdfcpu assumes you know about and respect all copyrights of any PDF content you may be processing. This applies to the PDF files as such, their content and in particular all embedded resources like font files or images. Credit goes to Renee French for creating our beloved Gopher.

License

Apache-2.0

Powered By

pdfcpu's People

Contributors

hhrutter avatar ungerik avatar simepel avatar dharnitski avatar matbur avatar richardwilkes avatar ryarnyah avatar haldyr avatar

Watchers

James Cloos avatar Robin avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.