
url-to-pdf-api's People

Contributors

aman601, andreyshishkanov, anilkapoorwingify, arcatdmz, danielruf, elroadster, esvitaly, guptarohit, keefbaker, kimmobrunfeldt, kjagiello, lanre-ade, louim, marconi, maxstalker, micgro42, nicky9door, nkimadusanka, onagurna, steakunderscore, tomasc, tranv94, tsingwong, vanthome, yundifu


url-to-pdf-api's Issues

scrollPage bug

Some websites (such as example) use a special lazy-loading strategy.

When users scroll quickly (<300 ms), images are not loaded. Images load only once the user stops scrolling to look at the content.

So I think we need another option (scrollInterval) that lets users test and choose the interval.

Related discussions:

puppeteer/puppeteer#338 (comment)

Thanks!
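
A minimal sketch of what the proposed option could look like, assuming a Puppeteer `page` object. The `scrollInterval` parameter and the `scrollPage` helper are hypothetical names taken from the proposal above, not an existing API of this project:

```javascript
// Hypothetical helper: scrolls the page one viewport at a time and pauses
// `scrollInterval` milliseconds between steps so lazy loaders can react.
async function scrollPage(page, scrollInterval = 300) {
  // page.evaluate runs the callback inside the browser and forwards the argument.
  await page.evaluate(async (interval) => {
    const step = window.innerHeight;
    // scrollHeight is re-read on every iteration because lazy content may grow the page.
    for (let y = 0; y < document.body.scrollHeight; y += step) {
      window.scrollTo(0, y);
      await new Promise((resolve) => setTimeout(resolve, interval));
    }
    // Return to the top before rendering.
    window.scrollTo(0, 0);
  }, scrollInterval);
}
```

A caller would then try something like `await scrollPage(page, 500)` and tune the interval per site.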

Font size decrease when pdf.width and pdf.height parameters are passed.

If the height and width parameters are passed while rendering an HTML page, the font size is somehow reduced, but the size of the content boxes is not affected.
Is this the expected result? If not, is there a solution (such as a flag) to make sure PDF rendering does not affect any applied styles (CSS)?

Searching for maintainers

Hi,

I'm searching for a few helping hands with the maintenance. This repo is definitely among my top open source maintenance priorities and I'll continue to be a maintainer as well, but I haven't had enough time to do good maintenance lately. I think it's healthy for any project to have at least two people with collaboration rights. If you'd like to join the effort, please respond to this issue and briefly describe your background in open source.

API key authentication

Hi folks! Could someone please point me to some documentation on how to do API key authentication? There's mention of it in the README, but no instructions yet, and I didn't see anything relevant in the Puppeteer docs. Any help appreciated!

Security issues

It's easy to make Chrome display any file:// link. A couple of ways:

  • Redirect
  • window.location.href

Let's figure out whether Puppeteer offers a few ways to block as many of these as possible. In any case, I'm quite confident that it's not possible to catch all of them. I would definitely recommend serving this API only to "trusted" users, e.g. inside your organization.
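
One partial mitigation could be sketched with Puppeteer's request interception, assuming a recent Puppeteer version where `page.setRequestInterception` is available. The helper names `isAllowedUrl` and `blockNonHttpRequests` are hypothetical, and this does not claim to close every hole:

```javascript
// Hypothetical allow-list check: only plain http(s) URLs pass.
function isAllowedUrl(rawUrl) {
  try {
    const { protocol } = new URL(rawUrl);
    return protocol === 'http:' || protocol === 'https:';
  } catch (err) {
    // Unparseable URLs are rejected.
    return false;
  }
}

// Aborts every request (including redirects and script-triggered
// navigations that go through the network layer) whose URL is not http(s).
async function blockNonHttpRequests(page) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    if (isAllowedUrl(request.url())) {
      request.continue();
    } else {
      request.abort();
    }
  });
}
```

Note that this strict check also rejects `data:` and `blob:` URLs, which some pages rely on for inline images, so the allow-list may need to be widened for real use.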

Issues with header and footer templates

This issue gathers a lot of issues with PDF header and footer templates. They are not as flexible as I and apparently many others have thought.

Headers and footers are not appearing

Working example: https://url-to-pdf-api.herokuapp.com/api/render?url=https://github.com&pdf.margin.bottom=100px&pdf.displayHeaderFooter=true&pdf.footerTemplate=%3Cp%20style=%22font-size:20px%22%3EFooter%20text%3C/p%3E

Styling is not working

See puppeteer/puppeteer#2916 and puppeteer/puppeteer#2388

ERR_CERT_AUTHORITY_INVALID

In my corporation we use self-signed certificates, which causes errors to be thrown. How do I disable SSL verification?

2017-10-11T20:44:32.919Z - info: [pdf-core.js] Set browser viewport..
2017-10-11T20:44:32.920Z - info: [pdf-core.js] Emulate @media screen..
2017-10-11T20:44:32.921Z - info: [pdf-core.js] Goto url http://google.com ..
2017-10-11T20:44:33.689Z - error: [pdf-core.js] Error when rendering page: Error: SSL Certificate error: ERR_CERT_AUTHORITY_INVALID
2017-10-11T20:44:33.689Z - error: [pdf-core.js] Error: SSL Certificate error: ERR_CERT_AUTHORITY_INVALID
    at NavigatorWatcher.waitForNavigation (/usr/src/app/node_modules/puppeteer/lib/NavigatorWatcher.js:73:20)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
2017-10-11T20:44:33.690Z - info: [pdf-core.js] Closing browser..
2017-10-11T20:44:33.708Z - error: [error-logger.js] Request headers: host=localhost:9000, user-agent=Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0, accept=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, accept-language=en-US,en;q=0.5, accept-encoding=gzip, deflate, connection=keep-alive, upgrade-insecure-requests=1
2017-10-11T20:44:33.708Z - error: [error-logger.js] Request parameters:
2017-10-11T20:44:33.709Z - error: [error-logger.js] Request body:
2017-10-11T20:44:33.710Z - error: [error-logger.js] Error: SSL Certificate error: ERR_CERT_AUTHORITY_INVALID
    at NavigatorWatcher.waitForNavigation (/usr/src/app/node_modules/puppeteer/lib/NavigatorWatcher.js:73:20)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7) 'Error: SSL Certificate error: ERR_CERT_AUTHORITY_INVALID\n    at NavigatorWatcher.waitForNavigation (/usr/src/app/node_modules/puppeteer/lib/NavigatorWatcher.js:73:20)\n    at <anonymous>\n    at process._tickCallback (internal/process/next_tick.js:188:7)'
GET /api/render?url=http://google.com&pdf.margin.top=2cm&pdf.margin.right=2cm&pdf.margin.bottom=2cm&pdf.margin.left=2cm 500 1021.139 ms - -


Cookies support

I was having difficulties getting the cookies to be sent with my request, and I think I may have found the problem. This function here is missing cookies assignment, and therefore the resulting cookies array is always empty.

Am I missing something? Thanks for the library by the way - it's just awesome!

Font weight ignored

I have an issue: whatever font-weight property I set is ignored. When I open the HTML in Chrome it looks fine, but when I generate a PDF from it, everything has 'regular' font weight. Has anyone experienced this issue? Is there a workaround?

Adding a footer and header on every page

Great work here -- it's 2017 and generating PDFs is still unnecessarily complicated. I'm currently using wkhtmltopdf. I'd love to stop using it and use this project as a microservice to handle all my PDF needs. However, the one thing I can't figure out is how to add a footer and/or header to every generated page. The header/footer would need to contain things like a logo, page number, warning, date, invoice number, etc., so it needs to be more customizable than the standard pdf.displayHeaderFooter option allows.

Does anyone have any experience with this? Is there something I'm missing? Thanks again for this awesome project.

URL and HTML issues with POST

Hi there,

I'm having trouble getting a POST request in Mithril.js to a locally hosted version of this repo to generate a PDF from the URL I pass through. The URL field is undefined on the server side.

This is what my call looks like:

m.request({
  method: "POST",
  url: "http://localhost:9000/api/render",
  headers: {
    "content-type": "application/json",
  },
  data: {
    "url": "http://www.google.com",
  },
})
  .then(function (result) {
    try {
      console.log('Worked');
    } catch (error) {
      console.log('Error:' + error);
    }
  })
  .catch(function (result) {
    console.log('Error: ' + result);
  })

On the server side I output the opts. I get this:

{ cookies: [],
  scrollPage: false,
  emulateScreenMedia: true,
  ignoreHttpsErrors: false,
  html: {},
  viewport:
   { width: 1600,
     height: 1200,
     deviceScaleFactor: undefined,
     isMobile: undefined,
     hasTouch: undefined,
     isLandscape: undefined },
  goto:
   { waitUntil: 'networkidle',
     networkIdleTimeout: 2000,
     timeout: undefined,
     networkIdleInflight: undefined },
  pdf:
   { format: 'A4',
     printBackground: true,
     scale: undefined,
     displayHeaderFooter: undefined,
     landscape: undefined,
     pageRanges: undefined,
     width: undefined,
     height: undefined,
     margin:
      { top: undefined,
        right: undefined,
        bottom: undefined,
        left: undefined } },
  url: undefined,
  attachmentName: undefined,
  waitFor: undefined }

When I do a curl command it works as expected, html is null and url contains the expected url.

What am I doing wrong? Thanks in advance!

Crash after 'read ECONNRESET' error

Hi,

I get this error randomly when I try to generate a pdf from my local url-to-pdf.

What I get

The server crashes with the following error: Error: read ECONNRESET at exports._errnoException (util.js:1018:11) at TCP.onread (net.js:568:26).
curl prints curl: (52) Empty reply from server

What I do

curl -o test_.pdf -XPOST [email protected] -H"content-type: text/html" http://localhost:9000/api/render\?emulateScreenMedia=false\&goto.waitUntil\=load

Solution?

This bug only happens AFTER the PDF generation, when browser.close() is called. I don't know whether it is caused by puppeteer closing its connection to Chrome or by the connection to one of the page's assets. Because the error happens after the PDF generation, I'm inclined to ignore it, which can be done by adding a callback with process.on('uncaughtException', (error) => {}). I'm not sure that's the correct thing to do, but for now it's the only solution I can provide.
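
A slightly narrower variant of that workaround might look like the sketch below. It assumes the error carries Node's standard `code` property set to `'ECONNRESET'`; the `shouldIgnore` helper is a hypothetical name. Everything else still terminates the process, so unrelated bugs are not silently swallowed:

```javascript
// Hypothetical predicate: true only for connection-reset errors.
function shouldIgnore(error) {
  return Boolean(error) && error.code === 'ECONNRESET';
}

process.on('uncaughtException', (error) => {
  if (shouldIgnore(error)) {
    // Log and keep serving; the PDF has already been generated at this point.
    console.warn('Ignoring connection reset after render:', error.message);
    return;
  }
  // Any other uncaught exception leaves the process in an unknown state.
  console.error(error);
  process.exit(1);
});
```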

The html file I use

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Test</title>
  <!-- Normalize or reset CSS with your favorite library -->
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/normalize/3.0.3/normalize.css">

  <!-- Load paper.css for happy printing -->
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/paper-css/0.2.3/paper.css">
  <style>
    @page { 
      size: A4; 
    }
    img {
      display: block;
      position: absolute;
    }

    img:nth-of-type(1) {
      left: 200px;
      top: 200px;
      transform: rotate(30deg);
    }
    img:nth-of-type(2) {
      left: 10%;
      top: 70%;
      transform: rotate(200deg);
    }
    img:nth-of-type(2) {
      float: right;
    }
  </style>
</head>
<body class="A4">
  <section class="sheet">
    <h1>Lorem ipsum dolor sit amet, consectetur adipisicing elit. Cum, laboriosam!</h1>
    <p>Lorem ipsum dolor sit amet, <u>consectetur</u> adipisicing elit. <em>Officia</em> <strong>aspernatur sed</strong> <i>quis</i> veniam! Itaque fugiat voluptas rerum necessitatibus iste, <b>dolores id eligendi minus! <i>Velit <u>alias</u></i> quos</b> , deleniti optio quod numquam perspiciatis sequi. Hic autem omnis non ipsam odio. Sit nostrum officia, ea officiis corporis tempore ut illum minus placeat repellat similique natus facere iusto aperiam rerum magni inventore in vero error, quisquam nihil dolore culpa optio necessitatibus, dicta? Sit quos enim, id quidem ea amet voluptas vitae odit sequi, ex aliquid commodi illum aperiam odio suscipit reiciendis</p>
    <img src="https://placehold.it/400x400" alt="placeholder">
    <img src="https://placehold.it/400x400" alt="placeholder">
    <img src="https://placehold.it/400x400" alt="placeholder">
    <img src="https://placehold.it/400x400" alt="placeholder">
    <img src="https://placehold.it/400x400" alt="placeholder">
  </section>
  <section class="sheet">
    <h1>Such wow</h1>
    <h2>Such wow</h2>
    <h3>Such wow</h3>
    <h4>Such wow</h4>
    <h5>Such wow</h5>
    <h6>Such wow</h6>
    <p style="text-align: left">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Minima, tempora? Lorem ipsum dolor sit amet, consectetur adipisicing elit. Molestiae ipsa inventore laborum rem deserunt placeat, praesentium soluta exercitationem corporis at, voluptatibus id atque amet voluptate mollitia nam sunt nisi, excepturi facilis nemo! Maiores deserunt qui, quia soluta culpa accusantium distinctio numquam eaque asperiores maxime suscipit, iusto inventore. Adipisci, quasi corporis!</p>
    <p style="text-align: right">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Minima, tempora? Lorem ipsum dolor sit amet, consectetur adipisicing elit. Laborum, suscipit? Officia rem dolorum, quisquam autem expedita ea odio aliquam dicta amet corporis voluptatum ipsam sequi ipsa accusantium enim molestiae nemo, qui, et odit quod corrupti ab? Odio, quisquam voluptatem aperiam totam illum repellendus temporibus harum dolores, laboriosam alias, doloremque et?</p>
    <p style="text-align: center">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Minima, tempora? Lorem ipsum dolor sit amet, consectetur adipisicing elit. Sapiente ipsam consectetur omnis ut repellendus, amet commodi minus fugit consequatur recusandae necessitatibus explicabo quasi nostrum eveniet dolores similique eligendi, expedita blanditiis doloremque nemo nobis. Sint aspernatur, mollitia expedita nulla est, rerum aliquam error. Provident saepe similique, dignissimos quia explicabo ab, nihil.</p>
    <p style="text-align: justify;">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Minima, tempora? Lorem ipsum dolor sit amet, consectetur adipisicing elit. Sapiente ipsam consectetur omnis ut repellendus, amet commodi minus fugit consequatur recusandae necessitatibus explicabo quasi nostrum eveniet dolores similique eligendi, expedita blanditiis doloremque nemo nobis. Sint aspernatur, mollitia expedita nulla est, rerum aliquam error. Provident saepe similique, dignissimos quia explicabo ab, nihil.</p>
    <h1 style="transform: rotate(180deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(50deg);text-align: center;">AMAZING</h1>
    <h1 style="transform: rotate(80deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(300deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(260deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(120deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(190deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
  </section>

</body>
</html>

Add support to 0 timeout requests

As written inside README.md, you can pass the value 0 to goto.timeout when performing a request to /api/render. However, this is supported from puppeteer 0.12.0, while this project currently uses 0.11.0.
I've seen that there is a branch for updating to the newest puppeteer, so this is maybe a non-issue (but the README could be updated before we actually merge the new branch)

Improve error handling

Awaited Puppeteer calls do not throw all errors. Some errors can only be caught from the page.on('error', cb) callback. We should surface these errors better in the responses. Currently almost all errors except validation errors are returned as 500 Internal Server Error, and the only place to see what happened is the application logs.
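
One way to route such callback-only errors into the normal promise flow is to race the render against the page's `'error'` event. This is only a sketch: `renderWithPageErrors` and the `render` callback are hypothetical names, and `page` is assumed to be a Puppeteer page (or any event emitter with `on`):

```javascript
// Returns a promise that never resolves and rejects when the page emits 'error'
// (Puppeteer emits this event when the page crashes).
function rejectOnPageError(page) {
  return new Promise((resolve, reject) => {
    page.on('error', reject);
  });
}

// Runs `render(page)` but fails early if the page reports an error meanwhile,
// so the caller can map it to a more specific HTTP response.
async function renderWithPageErrors(page, render) {
  return Promise.race([render(page), rejectOnPageError(page)]);
}
```

The caller can then wrap `renderWithPageErrors` in a single try/catch and decide which status code to return for each kind of failure.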

options in .env not used

When I alter the options in .env they are not used:

export NODE_ENV=development
export PORT=9990
export ALLOW_HTTP=true

When I use them as a prefix for the start command it works just fine:


ALLOW_HTTP=true PORT=9990 npm start

What am I doing wrong?

BTW, very nice piece of software!

grayscale

How can I convert my HTML to a grayscale PDF?

Fails to navigate on a non-.com

We have an internal site that I'm trying to grab PDFs from on the fly. The app works fine with any public URL, but not with our internal one.

2017-10-12T14:48:19.108Z - info: [pdf-core.js] Set browser viewport..
2017-10-12T14:48:19.109Z - info: [pdf-core.js] Emulate @media screen..
2017-10-12T14:48:19.109Z - info: [pdf-core.js] Goto url https://cef.erwf.nin.asn/ ..
2017-10-12T14:48:21.395Z - error: [pdf-core.js] Error when rendering page: Error: Failed to navigate: https://cef.erwf.nin.asn/
2017-10-12T14:48:21.396Z - error: [pdf-core.js] Error: Failed to navigate: https://cef.erwf.nin.asn/
    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)
    at <anonymous>
2017-10-12T14:48:21.396Z - info: [pdf-core.js] Closing browser..
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request headers: host=localhost:9000, user-agent=Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0, accept=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, accept-language=en-US,en;q=0.5, accept-encoding=gzip, deflate, connection=keep-alive, upgrade-insecure-requests=1
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request parameters:
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request body:
2017-10-12T14:48:21.408Z - error: [error-logger.js] Error: Failed to navigate: https://cef.erwf.nin.asn/
    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)
    at <anonymous> 'Error: Failed to navigate: https://cef.erwf.nin.asn/\n    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)\n    at <anonymous>'
GET /api/render?url=https://cef.erwf.nin.asn/ 500 2484.461 ms - -

random errors when rendering pdf from html via POST

First - thank you so much for creating and working on this project.

I've deployed to Heroku. Most of the time the PDF is generated, but sometimes there is an error and the entire node server crashes.

Here is the log: url_to_pdf_api_error-01-25-2018.log

Is there a good way to debug this problem? Currently it crashes roughly 20% of the time. I was running on "hobby" initially, but had the same results on 1x and 2x instance types.

Support cookies

I would not want to pass auth cookies to the hosted version, but locally I would like to pass in a URL and a cookie to be set. This would allow me to generate PDFs of my authenticated pages locally.

Thanks. It looks neat.

Becker

Issue passing more than one parameter.

If I try to pass more than one parameter, I get an error on the second parameter.
Example: https://urltopdf2.herokuapp.com/api/render?url=https://server1.outsystemscloud.com/automatedterritoryas/PDFEmail.aspx?Tenantid=109&Territoryid=564

If I browse to the URL directly, it works with no problem. When I try to use url-to-pdf-api, I get the following error:
{"status":400,"statusText":"Bad Request","errors":[{"field":["Territoryid"],"location":"query","messages":["\"Territoryid\" is not allowed"],"types":["object.allowUnknown"]}]}

Again, if I leave off the last &Territoryid=564, it works with no error. If I add it, I get the error.
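
The error message suggests that the unencoded `&` ends the `url` value, so `Territoryid` is read as a parameter of the API itself. A minimal sketch of building the request with the target URL percent-encoded follows; `buildRenderUrl` is a hypothetical helper, and the `/api/render` path is taken from the example above:

```javascript
// Hypothetical helper: builds an /api/render URL where the target URL is
// encoded as a single query value, so its own "?" and "&" survive intact.
function buildRenderUrl(apiBase, targetUrl, options = {}) {
  const renderUrl = new URL('/api/render', apiBase);
  // searchParams.set percent-encodes reserved characters in the value.
  renderUrl.searchParams.set('url', targetUrl);
  for (const [key, value] of Object.entries(options)) {
    renderUrl.searchParams.set(key, String(value));
  }
  return renderUrl.toString();
}

const target =
  'https://server1.outsystemscloud.com/automatedterritoryas/PDFEmail.aspx?Tenantid=109&Territoryid=564';
console.log(buildRenderUrl('https://urltopdf2.herokuapp.com', target));
```

With this encoding the API should receive only one top-level parameter, `url`, whose decoded value contains both `Tenantid` and `Territoryid`.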

how to generate a PDF with automatic height?

I use this Puppeteer microservice to generate receipts in PDF. For each receipt, width is always the same, but height changes, according to the article count in the order.

For now, I'm using the article count to approximate the required height for my receipt. It kind of works, but it's not perfect and is a dirty way to do it.
Is there a way to tell the Puppeteer API: "Please automatically find the right PDF height, according to the HTML body height, in order to generate a perfectly sized PDF"?
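
With direct access to Puppeteer, one possible approach is to measure the rendered document height in the page and pass it to `page.pdf` as an explicit height. This is a sketch under assumptions: `renderPdfWithAutoHeight` is a hypothetical helper, the default width is an arbitrary placeholder, and the measured height reflects whichever media type the page is currently emulating:

```javascript
// Hypothetical helper: measures the document height in CSS pixels and
// requests a single PDF page of exactly that height.
async function renderPdfWithAutoHeight(page, width = '80mm') {
  const heightPx = await page.evaluate(
    () => document.documentElement.scrollHeight
  );
  return page.pdf({
    width,
    height: `${heightPx}px`,
    printBackground: true,
  });
}
```

Treat the result as a starting point: margins and rounding can still push content onto a second page, so some padding may be needed in practice.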

CORS config is missing

CORS_ORIGIN is missing in config.js, and it is used in app.js:

  const corsOpts = {
    origin: config.CORS_ORIGIN, //undefined
    methods: ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS', 'HEAD', 'PATCH'],
  };

Feature request: Support rendering images

Hi,

First of all, thanks for this awesome project. It seems to be really well thought-out, so thank you for your efforts. I also really like the ability to render logged in pages by setting a cookie in the POST request.

Since you are using puppeteer, which also supports rendering pages to images via "screenshot", it would be possible to render images as well. Is this something you're interested in? We have some users who would like this, for example for dashboards that are displayed on monitors.

Internal Server Error

Some requests to the demo Heroku app return:

{
  status: 500,
  statusText: "Internal Server Error",
  messages: [
    "Internal Server Error"
  ]
}

Cloudflare and 301 redirects

Hello, has this software been tested to handle 301 redirects?
Cloudflare issues them, and other tools don't seem to follow them.

Only HTTPS allowed?

http://localhost:9000/api/render?url=http://google.com


2017-10-05T16:05:58.491Z - warn: [error-logger.js] Request headers: host=localhost:9000, user-agent=Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0, accept=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, accept-language=en-US,en;q=0.5, accept-encoding=gzip, deflate, connection=keep-alive, upgrade-insecure-requests=1
2017-10-05T16:05:58.491Z - warn: [error-logger.js] Request parameters:
2017-10-05T16:05:58.491Z - warn: [error-logger.js] Request body:
2017-10-05T16:05:58.491Z - warn: [error-logger.js] Error: Only HTTPS allowed.
GET /api/render?url=https://google.com 403 0.824 ms - 74

cookies

I am confused about how to assign cookies in the API. Could anyone help me? I have 3 cookies,
e.g. Evnetid = 6235765; sessionid = jshdak; documentID= sjdh; how do I enter these in the API?

I read the documentation and tried to put the values in, but I get an error every time. Could anyone help, please?
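
As a rough sketch, a cookie string like the one above can be split into an array of name/value objects. This assumes the API accepts a `cookies` array of `{ name, value }` objects in the JSON POST body, mirroring Puppeteer's `page.setCookie`; `parseCookieString` is a hypothetical helper and the target URL is a placeholder:

```javascript
// Hypothetical helper: turns "a = 1; b = 2;" into [{ name: 'a', value: '1' }, ...].
function parseCookieString(cookieString) {
  return cookieString
    .split(';')
    .map((pair) => pair.trim())
    // Skip empty fragments and anything without a "name=value" shape.
    .filter((pair) => pair.includes('='))
    .map((pair) => {
      const separator = pair.indexOf('=');
      return {
        name: pair.slice(0, separator).trim(),
        value: pair.slice(separator + 1).trim(),
      };
    });
}

// Assumed request body shape; the url is a placeholder.
const body = {
  url: 'https://example.com',
  cookies: parseCookieString('Evnetid = 6235765; sessionid = jshdak; documentID= sjdh;'),
};
console.log(JSON.stringify(body, null, 2));
```

The resulting object would be sent as the JSON body of a POST to `/api/render` with a `content-type: application/json` header.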
