
pdf-to-html's Introduction


PDF to HTML PHP Class

This class lets you use PHP and poppler-utils to convert your PDF files to HTML.

Important Notes

Please read the usage section below: this package has been substantially upgraded and its API has changed.

Installation

From your application's directory, run this command to add the package to your app

	composer require gufy/pdftohtml-php:~2

Or add this package to your composer.json

{
	"require": {
		"gufy/pdftohtml-php": "~2"
	}
}

Requirements

  1. Poppler-Utils (on Ubuntu, install it from apt: sudo apt-get install poppler-utils)
  2. PHP configured with shell access enabled

Usage

Here is a sample.

<?php
// if you are using composer, just use this
include 'vendor/autoload.php';

// initiate
$pdf = new Gufy\PdfToHtml\Pdf('file.pdf');

// convert to html string
$html = $pdf->html();

// convert a specific page to html string
$page = $pdf->html(3);

// convert to html and return it as [Dom Object](https://github.com/paquettg/php-html-parser)
$dom = $pdf->getDom();

// get the total number of pages in your pdf
$total_pages = $pdf->getPages();

// If your pdf has more than one page, use this to change the current page to page 3
$dom->goToPage(3);

// and then you can do as you please with that dom, you can find any element you want
$paragraphs = $dom->find('body > p');

// change pdftohtml bin location
\Gufy\PdfToHtml\Config::set('pdftohtml.bin', '/usr/local/bin/pdftohtml');

// change pdfinfo bin location
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', '/usr/local/bin/pdfinfo');
?>

Passing options to getDom

By default getDom() extracts all images and creates one html file per page. You can pass options when extracting html:

<?php
$pdfDom = $pdf->getDom(['ignoreImages' => true]);

Available Options

  • singlePage, default: false
  • imageJpeg, default: false
  • ignoreImages, default: false
  • zoom, default: 1.5
  • noFrames, default: true

Usage note for Windows Users

This package can also be used on Windows. First download the latest poppler-utils binary for Windows from http://blog.alivate.com.au/poppler-windows/.

After downloading it, extract it. The extracted files include a directory called bin, which is the one we need. Then change your code like this

<?php
// if you are using composer, just use this
include 'vendor/autoload.php';
use Gufy\PdfToHtml\Config;
// change pdftohtml bin location
Config::set('pdftohtml.bin', 'C:/poppler-0.37/bin/pdftohtml.exe');

// change pdfinfo bin location
Config::set('pdfinfo.bin', 'C:/poppler-0.37/bin/pdfinfo.exe');
// initiate
$pdf = new Gufy\PdfToHtml\Pdf('file.pdf');

// convert to html and return it as [Dom Object](https://github.com/paquettg/php-html-parser)
$dom = $pdf->getDom();

// get the total number of pages in your pdf
$total_pages = $pdf->getPages();

// If your pdf has more than one page, use this to change the current page to page 3
$dom->goToPage(3);

// and then you can do as you please with that dom, you can find any element you want
$paragraphs = $dom->find('body > p');

?>

Usage note for OS/X Users

Thanks to @kaleidoscopique for trying this package on OS/X and getting it to run.

1. Install brew

Brew is a famous package manager on OS/X : http://brew.sh/ (aptitude style).

2. Install poppler

brew install poppler

3. Verify the path of pdfinfo and pdftohtml

$ which pdfinfo
/usr/local/bin/pdfinfo

$ which pdftohtml
/usr/local/bin/pdftohtml

4. Whatever the paths are, use Gufy\PdfToHtml\Config::set to set them in your PHP code, using exactly the paths printed by the which command:

<?php
// if you are using composer, just use this
include 'vendor/autoload.php';

// change pdftohtml bin location
\Gufy\PdfToHtml\Config::set('pdftohtml.bin', '/usr/local/bin/pdftohtml');

// change pdfinfo bin location
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', '/usr/local/bin/pdfinfo');

// initiate
$pdf = new Gufy\PdfToHtml\Pdf('file.pdf');

// convert to html string
$html = $pdf->html();
?>

Feedback & Contribute

Open an issue for any improvement or bug. I love to help and solve other people's problems. Thanks 👍

pdf-to-html's People

Contributors

mgufrone, ncjoes, pauliusjacionis, wernerkrauss

pdf-to-html's Issues

not work in server

Hello @mgufrone, I have a problem after uploading my project to the server.
Here are the details.
It worked perfectly on my local Windows machine using this code:

<?php
include 'vendor/autoload.php';
use Gufy\PdfToHtml\Config;
Config::set('pdftohtml.bin', __DIR__.'/Poppler/bin/pdftohtml.exe');
Config::set('pdfinfo.bin', __DIR__.'/Poppler/bin/pdfinfo.exe');
$p = 'sample.pdf';
$pdf = new Gufy\PdfToHtml\Pdf($p);
$pdf->html();
?>

Unfortunately the PHP code will run on a Linux server, where .exe files cannot run. I found the usage note for OS/X and Debian/Ubuntu, which says to use:

Config::set('pdftohtml.bin', __DIR__.'/usr/local/bin/pdftohtml');
Config::set('pdfinfo.bin', __DIR__.'/usr/local/bin/pdfinfo');

Instead of:

Config::set('pdftohtml.bin', __DIR__.'/Poppler/bin/pdftohtml.exe');
Config::set('pdfinfo.bin', __DIR__.'/Poppler/bin/pdfinfo.exe');

  • How can I identify /usr/local/bin/ on a Linux server ?
  • How can I install Poppler-Utils for Linux ?
  • Is it possible to do all this on a shared hosting ?

I appreciate your help
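
A note on the first two questions, offered as a sketch rather than an official answer. The README installs Poppler-Utils on Ubuntu with sudo apt-get install poppler-utils, and the OS/X note finds the binaries with which. The same lookup can be done from PHP by walking the PATH directories. The helper name findBinary is hypothetical and not part of this package. Note also that an absolute path such as /usr/local/bin/pdftohtml should be passed as-is, without the __DIR__ prefix used in the snippet above.

```php
<?php
/**
 * Search the directories of a PATH-style string for an executable.
 * Returns the full path, or null when nothing is found.
 * (Hypothetical helper, not part of gufy/pdftohtml-php.)
 */
function findBinary(string $name, ?string $pathEnv = null): ?string
{
    // Fall back to the PATH of the current process.
    $pathEnv = $pathEnv ?? (string) getenv('PATH');

    foreach (explode(PATH_SEPARATOR, $pathEnv) as $dir) {
        if ($dir === '') {
            continue;
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }
    return null;
}

$pdftohtml = findBinary('pdftohtml');
$pdfinfo   = findBinary('pdfinfo');

if ($pdftohtml === null || $pdfinfo === null) {
    echo "poppler-utils was not found on PATH\n";
} else {
    // These are the values to hand to Config::set() as shown in the README.
    echo "pdftohtml: $pdftohtml\npdfinfo: $pdfinfo\n";
}
```

If both lookups return null on a shared host, the binaries are probably not installed, and installing them usually needs shell or administrator access.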

How to use this on server

Hello Friend ,
I have used your API on Windows and it works perfectly, but I have to keep Poppler-0.47.0 in my root folder on Windows. Now I want to use this on a hosted server. How can I use Poppler-0.47.0 there, or is there another way?

Help me out

I use Linux Mint. Could I get a video of it? It would be very helpful to understand by watching a video.

linking to html files is wrong

frame name="links" src="output/file/file_ind.html"
frame name="contents" src="output/file/file-1.html"

Should be:

frame name="links" src="file_ind.html"
frame name="contents" src="file-1.html"


img width="892" height="1262" src="output/file/file001.png" alt="background image"

Should be:

img width="892" height="1262" src="file001.png" alt="background imag

Because they exist in the SAME folder
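
Until the paths are fixed in the package, one possible workaround is to post-process the generated markup and reduce every src attribute to its file name. This is only a sketch: stripSrcDirectories is a hypothetical helper, and it assumes the attributes use double quotes and point at local files, as in the lines quoted above.

```php
<?php
/**
 * Rewrite src="some/dir/file.ext" to src="file.ext".
 * (Hypothetical helper; assumes double-quoted, local src values.)
 */
function stripSrcDirectories(string $html): string
{
    $result = preg_replace_callback(
        '/\bsrc="([^"]+)"/i',
        static function (array $m): string {
            // basename() keeps only the last path segment.
            return 'src="' . basename($m[1]) . '"';
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; keep the input then.
    return $result ?? $html;
}

$fixed = stripSrcDirectories('<frame name="links" src="output/file/file_ind.html">');
echo $fixed, "\n";
```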

Doesn't work at Windows

After the installation with composer it gives an error
Fatal error: Class 'Gufy\PdfToHtml\Config' not found in C:\OpenServer\domains\phppdf.local\index.php on line 6

linking to html

linking to html issue is not resolved yet.

frame name="links" src="output/file/file_ind.html"
frame name="contents" src="output/file/file-1.html"

Should be:

frame name="links" src="file_ind.html"
frame name="contents" src="file-1.html"

Class 'Gufy\PdfToHtml\Config' not found

How to solve this?

I followed the docs..

I put use Gufy\PdfToHtml\Config; at the top and this is my controller

public function convertPdf(Request $request){

        // dd($request->all());    
        
        // change pdftohtml bin location
        \Gufy\PdfToHtml\Config::set('pdftohtml.bin', 'C:/wamp64/www/iaccs-admin-console/public/bin/pdftohtml.exe');

        // change pdfinfo bin location
        \Gufy\PdfToHtml\Config::set('pdfinfo.bin', 'C:/wamp64/www/iaccs-admin-console/public/bin/pdfinfo.exe');
        // initiate
        $pdf = new Gufy\PdfToHtml\Pdf($request);

        // convert to html and return it as [Dom Object](https://github.com/paquettg/php-html-parser)
        $html = $pdf->html();

        // check if your pdf has more than one pages
        $total_pages = $pdf->getPages();

        // Your pdf happen to have more than one pages and you want to go another page? Got it. use this command to change the current page to page 3
        $html->goToPage(1);

        // and then you can do as you please with that dom, you can find any element you want
        $paragraphs = $html->find('body > p');

        return $paragraphs;

    }

Treatment of HTML generated file

Hi,

Instead of an HTML file, I need the result as an array indexed by line and column.

<!! p style="position:absolute;top:515px;left:73px;white-space:nowrap" class="ft01">Item No.<!!/p>
<!! p style="position:absolute;top:515px;left:217px;white-space:nowrap" class="ft01">Description of Goods<!!/p>
<!! p style="position:absolute;top:515px;left:481px;white-space:nowrap" class="ft01">Units Qty<!!/p>

it should output:
$data[0][0] = Item No.
$data[0][1] = Description of Goods
$data[0][2] = Units Qty

Is there a way to do so?

Thank you
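
One way this could be approached, as a sketch and not a feature of the package: read the top and left values from each paragraph's inline style, treat paragraphs with the same top value as one row, and order the cells of a row by left. The function name paragraphsToGrid is hypothetical, and the sample uses plain <p> tags in place of the escaped <!! p notation above.

```php
<?php
/**
 * Group absolutely positioned <p> elements into rows (same "top")
 * and order the cells of each row by "left".
 * (Hypothetical helper; assumes inline styles with px units.)
 */
function paragraphsToGrid(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence markup warnings
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('p') as $p) {
        $style = $p->getAttribute('style');
        // The lookbehind skips properties such as margin-top or padding-left.
        if (!preg_match('/(?<![\w-])top:\s*(\d+)px/', $style, $top) ||
            !preg_match('/(?<![\w-])left:\s*(\d+)px/', $style, $left)) {
            continue;
        }
        $rows[(int) $top[1]][(int) $left[1]] = trim($p->textContent);
    }

    ksort($rows);                       // rows from top to bottom
    $grid = [];
    foreach ($rows as $cells) {
        ksort($cells);                  // cells from left to right
        $grid[] = array_values($cells);
    }
    return $grid;
}

$html = '<p style="position:absolute;top:515px;left:217px;white-space:nowrap" class="ft01">Description of Goods</p>'
      . '<p style="position:absolute;top:515px;left:73px;white-space:nowrap" class="ft01">Item No.</p>'
      . '<p style="position:absolute;top:515px;left:481px;white-space:nowrap" class="ft01">Units Qty</p>';

$data = paragraphsToGrid($html);
print_r($data);
```

Rows whose text baselines differ by a pixel or two would land in separate rows with this exact-match grouping, so real documents may need a tolerance when comparing top values.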

Problem running on CentOS 6

First of all, congratulations on your very good work!
My development environment consists of OSXs and Windows machines, and everything works pretty fine. I recently deployed my application to my homologation server (which is currently running CentOS 6) and noticed that the Base.php performs the following system call:

/usr/local/bin/pdftohtml -c -fmt png -zoom 1.5 'path_to_file/file.pdf' 'path_to_file/file.html'

However, the linux binary pdftohtml (which was downloaded from http://www.foolabs.com/xpdf/download.html) does not contain the -c, -fmt, -zoom options. Thus, the html file is not being generated at all.

Could you please help me on that?
Thanks very much!
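
A possible first check, given that the README lists Poppler-Utils as the requirement and the binary in this report came from xpdf: ask the binary for its version text and look for the word "Poppler". This sketch assumes that both builds accept -v and that the Poppler build names itself in that output. The helper isPopplerBuild is hypothetical.

```php
<?php
/**
 * Guess whether version text came from the Poppler build of pdftohtml.
 * (Hypothetical helper; assumes Poppler mentions itself in the -v output.)
 */
function isPopplerBuild(string $versionOutput): bool
{
    return stripos($versionOutput, 'poppler') !== false;
}

$bin = '/usr/local/bin/pdftohtml'; // path taken from the report above

if (function_exists('shell_exec')) {
    // Version text is typically written to stderr, so merge it into stdout.
    $output = (string) shell_exec(escapeshellarg($bin) . ' -v 2>&1');
    echo isPopplerBuild($output)
        ? "This looks like the Poppler build.\n"
        : "This does not look like the Poppler build.\n";
}
```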

How do I use this PHP script on my shared hosting without composer?

How can I use this PHP script, without composer (and without shell access) on my shared hosting server? Can I copy/paste the file in Filezilla (via FTP) to my shared hosting?
How can I fix this? Maybe make the autoload.php by myself?

Can anyone help me? I really need this class, but my hosting doesn't support ssh / composer.

Thanks in advance!

Not working with wamp

Hi Sir,

I am using wampserver.
I want to convert PDF to HTML. I am using your script but it's not working.
My PHP version is 5.5.12
I am using following extensions for this.
extension=libfreetype-6.dll
extension=libgcc_s_sjlj-1.dll
extension=libjpeg-8.dll
extension=liblcms2-2.dll
extension=liblzma-5.dll
extension=libopenjpeg-1.dll
extension=libpng15-15.dll
extension=libpoppler-28.dll
extension=libpoppler-cpp-0.dll
extension=libstdc++-6.dll
extension=libtiff-5.dll

Please suggest how I can resolve it.
Thanks.

Windows NT requires double-quotes instead of single-quotes

Windows requires double quotes around paths that contain spaces or other non-alphanumeric characters.

Line 26 of class Gufy\PdfToHtml\Pdf uses single quotes, which I guess is compatible with Linux, but not with Windows.

$content = shell_exec($this->bin()." '".$this->file."'");
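
A portable alternative to hand-written quotes is PHP's built-in escapeshellarg(), which uses double quotes on Windows and single quotes elsewhere. The sketch below shows the idea in isolation; buildCommand is a hypothetical helper and is not how the package builds its command today.

```php
<?php
/**
 * Build a command line with the file argument quoted for the current OS.
 * (Hypothetical helper illustrating escapeshellarg().)
 */
function buildCommand(string $bin, string $file): string
{
    // escapeshellarg() picks the quoting style the platform's shell expects.
    return $bin . ' ' . escapeshellarg($file);
}

$command = buildCommand('pdfinfo', 'my report.pdf');
echo $command, "\n";
```

Because the quoting is chosen at run time, the same line would also cover file names that contain spaces on Linux.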

Does this extension work on a dedicated Linux server?

I am a little confused about how to pass the PDF file path. I added the path, but no function returns anything on the server.
$html = $pdf->getInfo();
All of the functions return empty data.

When I call "$page = $pdf->html();" I get the following error:
"Exception
You're asking to go to page 1 but max page of this document is 0",

Can anyone let me know how to fix this issue?
Thanks in advance.
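
One plausible reading of "max page of this document is 0" is that pdfinfo produced no output, for example because the configured binary path is wrong or shell access is disabled, so no Pages line could be read. The sketch below parses pdfinfo-style "Key: value" lines to show how empty output ends up as zero pages. parsePdfInfo and pageCount are hypothetical helpers, not the package's own code.

```php
<?php
/**
 * Turn pdfinfo-style "Key: value" lines into an associative array.
 * (Hypothetical helper; the package's own parsing may differ.)
 */
function parsePdfInfo(?string $output): array
{
    $info = [];
    foreach (explode("\n", (string) $output) as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $info[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $info;
}

/** Page count from pdfinfo output, or 0 when no Pages line exists. */
function pageCount(?string $output): int
{
    $info = parsePdfInfo($output);
    return isset($info['Pages']) ? (int) $info['Pages'] : 0;
}

$sample = "Title:          Example\nPages:          3\nPage size:      595 x 842 pts (A4)\n";

echo pageCount($sample), "\n"; // a normal document
echo pageCount(null), "\n";    // what a failed shell call would yield
```

Running the configured pdfinfo command by hand on the server, with the same file path, is a quick way to see whether it prints anything at all.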

goToPage() throws error

First of all thanks for your great work.
I receive the following error when I try to change the page number.
If I don't use goToPage() method, page 1 will be rendered correctly.

Fatal error: Uncaught Error: Call to a member function goToPage() on string in /Applications/XAMPP/xamppfiles/htdocs/web/pdftohtml/index.php:17 Stack trace: #0 {main} thrown in /Applications/XAMPP/xamppfiles/htdocs/web/pdftohtml/index.php on line 17

This is my test php file:

html();
$html->goToPage(3);
echo $html;
$total_pages = $pdf->getPages();
echo $total_pages;
?>

any idea what I am doing wrong ?

thanks

How do i use this?

Hi, I am trying to figure out how to use this with Laravel 5.
This is what I tried in order to set the bin locations. I'm still not quite sure of the difference between the two settings. I pointed both config locations to the directory where my PDF file resides.

\Gufy\PdfToHtml\Config::set('pdftohtml.bin', '/pdf/');
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', '/pdf/');

It's still returning null for this:

$pdf = new \Gufy\PdfToHtml\Pdf('testing.pdf');

Can someone share with me what went wrong?

Cheers,
Ralee
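
Judging by the README, the two settings are the locations of the pdftohtml and pdfinfo executables, not the folder that holds the PDF. A small check like the following can catch a path that points somewhere else. It is a sketch, and describeBinaryProblem is a hypothetical helper.

```php
<?php
/**
 * Explain why a path cannot be used as a binary location,
 * or return null when it looks usable.
 * (Hypothetical helper, not part of the package.)
 */
function describeBinaryProblem(string $path): ?string
{
    if (!file_exists($path)) {
        return "$path does not exist";
    }
    if (is_dir($path)) {
        return "$path is a directory, not an executable file";
    }
    if (!is_executable($path)) {
        return "$path exists but is not executable";
    }
    return null;
}

// '/pdf/' is the value used in the snippet above.
$problem = describeBinaryProblem('/pdf/');
echo $problem ?? 'path looks usable', "\n";
```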

Composer not installed

You can't use this code without installing composer, and when you try to install the package it displays this error:
"gufy/pdftohtml-php" not found.

Error: Document base stream is not seekable

Good afternoon. Lately I have been getting the following error when I try to convert my PDF to HTML:
Error: Document base stream is not seekable

Do you know why this happens?

Setting the generate image path

Hi all,

I may have missed this in the documentation, but I can't seem to find it.
Is it possible to set the path of the generated HTML/images?

The script currently creates a folder in: vendor/gufy/pdftohtml-php/output/

I need to save/manipulate the images after the PDF has been converted to HTML.

If there is no way to do this, can they be extracted as Base64?
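
On the Base64 part of the question: once an image file exists on disk, it can be turned into a data URI with plain PHP, independently of this package. The sketch below assumes the image is a PNG or JPEG file. imageToDataUri is a hypothetical helper.

```php
<?php
/**
 * Encode an image file as a data URI usable in an <img src="...">.
 * (Hypothetical helper; only PNG and JPEG extensions are mapped.)
 */
function imageToDataUri(string $path): string
{
    $types = ['png' => 'image/png', 'jpg' => 'image/jpeg', 'jpeg' => 'image/jpeg'];
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if (!isset($types[$ext])) {
        throw new InvalidArgumentException("Unsupported image type: $ext");
    }

    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return 'data:' . $types[$ext] . ';base64,' . base64_encode($bytes);
}

// Demo with a throwaway file; a real call would point at a generated image.
$demoFile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'demo_' . uniqid() . '.png';
file_put_contents($demoFile, 'abc');
$uri = imageToDataUri($demoFile);
unlink($demoFile);
echo $uri, "\n";
```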

goToPage() throws error

Hi,
I have used version 2.0.7 on Windows and I am still getting the error below:
Call to a member function goToPage() on string.

Please help me to sort it out.

Thanks

Formatting issue

Hi @mgufrone,

I've been using this library but none of my converted files are well formatted. I used your example test.pdf, and some of its content is misaligned. Do you have any idea where I can tweak the format of the output?

Also, is it just me, or are tables and images not supported in this library?

Thanks for your time
Cheers
Ralee

taking extra spaces in html

I converted a file using this library, but the text I get is not the same as the text that appears in the PDF.
For example:
pdf -> html
xxx yyy -> xxxyyy
Experian -> Ex p e ri an

This makes it hard to extract exact data. I need help as soon as possible.
Thanks in advance..!!

mb_eregi_replace() Error

Any idea why i'm getting this error?

Fatal error: Uncaught Error: Call to undefined function PHPHtmlParser\mb_eregi_replace() in /var/www/html/superboringproject/vendor/paquettg/php-html-parser/src/PHPHtmlParser/Dom.php:362 Stack trace: #0 /var/www/html/superboringproject/vendor/paquettg/php-html-parser/src/PHPHtmlParser/Dom.php(184): PHPHtmlParser\Dom->clean('<body bgcolor="...') #1 /var/www/html/superboringproject/vendor/paquettg/php-html-parser/src/PHPHtmlParser/Dom.php(132): PHPHtmlParser\Dom->loadStr('<body bgcolor="...', Array) #2 /var/www/html/superboringproject/vendor/gufy/pdftohtml-php/src/Html.php(66): PHPHtmlParser\Dom->load('<body bgcolor="...') #3 /var/www/html/superboringproject/vendor/gufy/pdftohtml-php/src/Html.php(57): Gufy\PdfToHtml\Html->goToPage(1) #4 /var/www/html/superboringproject/vendor/gufy/pdftohtml-php/src/Html.php(16): Gufy\PdfToHtml\Html->getContents('sample.pdf') #5 /var/www/html/superboringproject/vendor/gufy/pdftohtml-php/src/Pdf.php(56): Gufy\PdfToHtml\Html->__construct('sample.pdf') #6 /var/www/html/superboringproject/pdf.ph in /var/www/html/superboringproject/vendor/paquettg/php-html-parser/src/PHPHtmlParser/Dom.php on line 362
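
mb_eregi_replace() belongs to PHP's mbstring extension, so this error usually means the extension is not loaded in the PHP build that serves the request. A quick check, as a sketch with the hypothetical helper missingFunctions:

```php
<?php
/**
 * Return the names from $functions that are not defined in this PHP build.
 * (Hypothetical helper for a quick environment check.)
 */
function missingFunctions(array $functions): array
{
    $missing = [];
    foreach ($functions as $name) {
        if (!function_exists($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

$missing = missingFunctions(['mb_eregi_replace']);

if ($missing === []) {
    echo "mbstring regex functions are available\n";
} else {
    echo 'Missing: ' . implode(', ', $missing) . " (is the mbstring extension enabled?)\n";
}
```

The command line and the web server can load different php.ini files, so it is worth running the check through the web server as well.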

Fatal error: Uncaught exception

I have added code but getting some error.

html();
var_dump($html);

// check if your pdf has more than one pages
$total_pages = $pdf->getPages();

// Your pdf happen to have more than one pages and you want to go another page? Got it. use this command to change the current page to page 3
$html->goToPage(3);

// and then you can do as you please with that dom, you can find any element you want
$paragraphs = $html->find('body > p');
?>

Error getting is

Fatal error: Uncaught exception 'Exception' with message 'You're asking to go to page 1 but max page of this document is 0' in C:\xampp\htdocs\pdf-to-html-master\vendor\gufy\pdftohtml-php\src\Html.php:53 Stack trace: #0 C:\xampp\htdocs\pdf-to-html-master\vendor\gufy\pdftohtml-php\src\Html.php(48): Gufy\PdfToHtml\Html->goToPage(1) #1 C:\xampp\htdocs\pdf-to-html-master\vendor\gufy\pdftohtml-php\src\Html.php(10): Gufy\PdfToHtml\Html->getContents('file.pdf') #2 C:\xampp\htdocs\pdf-to-html-master\vendor\gufy\pdftohtml-php\src\Pdf.php(45): Gufy\PdfToHtml\Html->__construct('file.pdf') #3 C:\xampp\htdocs\pdf-to-html-master\index.php(14): Gufy\PdfToHtml\Pdf->html() #4 {main} thrown in C:\xampp\htdocs\pdf-to-html-master\vendor\gufy\pdftohtml-php\src\Html.php on line 53

How to handle special chars?

My code to convert pdf into html file is:

\Gufy\PdfToHtml\Config::set('pdftohtml.bin', '/usr/local/bin/pdftohtml');
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', '/usr/local/bin/pdfinfo');

$pdf = new Pdf('MY_DOCUMENT_PATH.pdf');
$page = $pdf->html();

Everything was working fine, but now the PDF document contains some special characters and I'm getting the following error message:

DOMDocument::loadHTML(): Invalid char in CDATA 0x1 in Entity, line: ...

I tried with $pdf->html() and $pdf->getDom(), I get the same error.

With libxml_use_internal_errors(true) I get no errors but after conversion there is double content.

How is it possible to avoid this error message or to remove special chars...?
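
The 0x1 in the message is an ASCII control character. One way to avoid the warning, assuming you can get at the HTML string before it reaches DOMDocument, is to strip such characters first. This is a sketch using the hypothetical helper stripControlChars. It would not by itself explain the doubled content.

```php
<?php
/**
 * Remove ASCII control characters that libxml rejects,
 * keeping tab, line feed and carriage return.
 * (Hypothetical helper, applied before DOMDocument::loadHTML().)
 */
function stripControlChars(string $html): string
{
    // preg_replace() returns null on failure; fall back to the input then.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $html) ?? $html;
}

$dirty = "<p>Total\x01 amount</p>\n";
$clean = stripControlChars($dirty);

$doc = new DOMDocument();
$doc->loadHTML($clean);
echo trim($doc->textContent), "\n";
```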

vendor/gufy/pdftohtml-php must have write access

pdftohtml-php creates a folder under vendor/gufy/pdftohtml-php/output. If it doesn't have write access there, you get loads of errors, starting with mkdir() permission denied.

Maybe add this to the readme

Or better still, have a config setting for setting the output folder eg. /tmp
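
Until such a setting exists, an application could at least verify the folder up front and fail with a clear message. The sketch below uses the hypothetical helper ensureWritableDir, and the demo runs against a temporary directory instead of the vendor path.

```php
<?php
/**
 * Make sure a directory exists and is writable by the current user.
 * (Hypothetical helper for a start-up check.)
 */
function ensureWritableDir(string $dir): bool
{
    // Try to create the directory (and its parents) when it is missing.
    if (!is_dir($dir) && !@mkdir($dir, 0775, true)) {
        return false;
    }
    return is_writable($dir);
}

// In an application this would be the package's output folder, e.g.
// __DIR__ . '/vendor/gufy/pdftohtml-php/output'. A temp path is used here.
$outputDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'pdftohtml_out_' . uniqid();

if (ensureWritableDir($outputDir)) {
    echo "output folder is writable\n";
} else {
    echo "output folder is missing or read-only\n";
}
```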

(Tuto) How to run this library on OS/X ?

Hi Mochamad,

Thanks a lot for this awesome library !

I was writing a big ticket where I asked "HELPP, can you help me to run your lib on OS/X please ?" but by writing my issue, I've found the solution ! Then I thought it will be a good idea to share the tips.

1. Install brew
Brew is a famous package manager on OS/X : http://brew.sh/ (aptitude style).

2. Install poppler

brew install poppler

3. Verify the path of pdfinfo and pdftohtml

$ which pdfinfo
/usr/local/bin/pdfinfo

$ which pdftohtml
/usr/local/bin/pdftohtml

4. Whatever the paths are, use Gufy\PdfToHtml\Config::set to set them in your PHP code, using exactly the paths printed by the which command:

<?php
// if you are using composer, just use this
include 'vendor/autoload.php';

// change pdftohtml bin location
\Gufy\PdfToHtml\Config::set('pdftohtml.bin', '/usr/local/bin/pdftohtml');

// change pdfinfo bin location
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', '/usr/local/bin/pdfinfo');

// initiate
$pdf = new Gufy\PdfToHtml\Pdf('file.pdf');

// convert to html and return it as [Dom Object](https://github.com/paquettg/php-html-parser)
$html = $pdf->html();
?>

Everything will work like a charm ! :-)
Stéphane

pdftohtml conversion only converting the 1st page or even less than that.

\Gufy\PdfToHtml\Config::set('pdftohtml.bin', '/usr/bin/pdftohtml');
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', '/usr/bin/pdfinfo');
$pdf = new \Gufy\PdfToHtml\Pdf('/var/www/html/test_big/test.pdf');
$html = $pdf->html();

That's my code. It converts, but in some cases only the first page, or even less than that.

Spaces in PDF file names

If a PDF filename contains spaces pdfinfo will not work. This can be easily fixed by applying the following patch:

diff --git a/vendor/gufy/pdftohtml-php/src/Pdf.php b/vendor/gufy/pdftohtml-php/src/Pdf.php
index 842690a..78d5768 100644
--- a/vendor/gufy/pdftohtml-php/src/Pdf.php
+++ b/vendor/gufy/pdftohtml-php/src/Pdf.php
@@ -19,7 +19,7 @@ class Pdf
   }
   protected function info()
   {
-    $content = shell_exec($this->bin().' '.$this->file);
+    $content = shell_exec($this->bin(). " '" .$this->file . "'");
     // print_r($info);
     $options = explode("\n", $content);
     $info = array();

Maximum execution time error

I am implementing this on Windows. Every dependency is installed, but now I get this error:
"Maximum execution time of 30 seconds exceeded in pdf-to-html\src\Base.php"
Dumping the object gives:
Gufy\PdfToHtml\Pdf Object ( [file:protected] => ./uploads/2/2.pdf [info:protected] => )
The properties are shown as protected. Is that OK?
Any clue why this error occurs?

PHP 7.1 + Compatibility

Seeing as this package is basically abandoned, I've decided to fork it. I will maintain it as well.

This library makes use of another (ironically) abandoned package for DOM Parsing. That package incorrectly implements the count() function on non-iterable instances, therein making it incompatible with PHP 7.1.

I've upgraded the underlying DOM Parsing engine and made cross-platform compatibility improvements.

You can use this library for better support and error handling: https://github.com/garrensweet/pdf-to-html

Error exception when converting pdf to html file on a Windows machine?

I'm trying to convert a PDF to an HTML file on my Windows machine.
These are the configuration lines:

    Config::set('pdftohtml.bin', 'D:\xampp\htdocs\poppler-0.51\bin\pdftohtml.exe');
    Config::set('pdfinfo.bin', 'D:\xampp\htdocs\poppler-0.51\bin\pdfinfo.exe');

I tried this with Postman, and the attached image shows the error. Can you please help me resolve this issue?

image

@mgufrone

error while trying to run

Fatal error: Uncaught exception 'Exception' with message 'You're asking to go to page 1 but max page of this document is 0' in C:\xampp\htdocs\ph\src\Html.php:63 Stack trace: #0 C:\xampp\htdocs\ph\src\Html.php(57): Gufy\PdfToHtml\Html->goToPage(1) #1 C:\xampp\htdocs\ph\src\Html.php(16): Gufy\PdfToHtml\Html->getContents('test.pdf') #2 C:\xampp\htdocs\ph\src\Pdf.php(56): Gufy\PdfToHtml\Html->__construct('test.pdf') #3 C:\xampp\htdocs\ph\src\Pdf.php(61): Gufy\PdfToHtml\Pdf->getDom() #4 C:\xampp\htdocs\ph\index.php(9): Gufy\PdfToHtml\Pdf->html() #5 {main} thrown in C:\xampp\htdocs\ph\src\Html.php on line 63

image

Require public method getPage

I need a public method \Gufy\PdfToHtml\Html::getPage so that I can parse the HTML myself. \PHPHtmlParser\Dom::parse breaks my HTML on some tags (br, b) on some Linux servers, and I can't find the reason.

Example:

public function getPage ($page) {
    return isset($this->contents[$page]) ? $this->contents[$page] : null;
}

Formatting & Rendering issue

I can't believe how amazing this library is! I am just having some rendering issues mainly checkbox form fields and the length of the underscores. Is there any way to change the rendering so that it will show an HTML checkbox rather than nothing? How can I go about customizing the replacement of PDF elements (Text areas, Text input, radio buttons)?

HTML Output Class meaning

Hi,
First of all, many thanks for this code, it works very well !

I have the HTML produced from parsing a PDF, and it contains many classes such as ft03, ft04, ft02 and ft01 (which seem to mark the content) and ft08, ft09 ... (which seem to mark something else).
But reading through the full HTML, I see no real logic:
for example, from one page to the next, plain text content without any styling is ft03, then ft04 on the next page, then ft02 on the page after that ...

image

I want to extract and sort each piece of PDF text according to its place in the hierarchy, which is why I want to analyse these classes.

Does anyone have an idea?
Thanks in advance,
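
A possible starting point, offered as a guess about the output rather than documented behaviour: if each generated page carries a style block that defines its ftNN classes, then the class names are only labels for font settings on that page, and the same look can get a different number on another page. Reading the rules per page makes them comparable by their properties. parseFontClasses is a hypothetical helper, and the sample CSS is made up for illustration.

```php
<?php
/**
 * Collect ".ftNN{...}" rules from a page's markup as
 * class name => [property => value].
 * (Hypothetical helper; assumes the rules are embedded in the page.)
 */
function parseFontClasses(string $html): array
{
    $classes = [];
    if (preg_match_all('/\.(ft\d+)\s*\{([^}]*)\}/', $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $props = [];
            foreach (explode(';', $m[2]) as $declaration) {
                $pair = explode(':', $declaration, 2);
                if (count($pair) === 2) {
                    $props[trim($pair[0])] = trim($pair[1]);
                }
            }
            $classes[$m[1]] = $props;
        }
    }
    return $classes;
}

// Illustrative markup, not taken from a real conversion.
$page = '<style type="text/css">'
      . '.ft01{font-size:16px;font-family:Times;color:#000000;}'
      . '.ft02{font-size:22px;font-family:Times;color:#000000;}'
      . '</style>';

$fonts = parseFontClasses($page);
print_r($fonts);
```

Sorting text by font-size taken from this map, instead of by class name, would give a rough heading hierarchy that stays stable across pages.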

Using with Yii on localhost

This is an example of some very basic code. I did combine some class files and make some minor adjustments to get it functioning on the Yii framework.

Yii::import('application.extensions.pdf2html.*');

// change pdftohtml bin location
Config::set('pdftohtml.bin', '/usr/local/bin/pdftohtml');

// change pdfinfo bin location
Config::set('pdfinfo.bin', '/usr/local/bin/pdfinfo');

$pdf = "http://www.orimi.com/pdf-test.pdf";

$pdf = new Pdf($pdf);

// convert to html string
$html = $pdf->html();

return $html;

And I am getting this error. I did install poppler and verified the location with "which" but I am not sure what my issue is now. Thank you in advance this is the only thing that I think is going to work for what I am doing.

image

Why is this issue?

Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 27 in C:\xampp\htdocs\pdftohtml\vendor\gufy\pdftohtml-php\src\Html.php on line 45
