dmester / pdftosvg.net
Fully managed .NET library for converting PDF files to SVG.
Home Page: https://pdftosvg.net
License: Other
Hello
We tested this file:
Untitled design (7).pdf
In the online converter at pdftosvg.net, the text is correct:
In the Unity NuGet package, it shows as:
The text: it's -> it AAAA s
Do we need to set an encoding somewhere?
Best regards
Hello
We have tested a PDF file that contains a Coons patch mesh.
CoonsPatchMesh file: meshgradient.pdf
The area covered by the Coons patch mesh is rendered black.
If a PDF uses fonts that are not installed on the target machine, the font-family name recorded in the PDF is not reflected in the SVG.
In Acrobat Reader the correct font name can be seen:
In the SVG it appears like this:
.tx3KyHh{font-family:monospace;font-weight:bold;font-size:14.3px;}
The reported font-family is expected to be the correct one.
This will make it easier to use the font if installed or to select a suitable substitution font.
Hello
There are many JS libraries that make it easy to interact with an HTML div (pan/pinch/rotate). Editing an SVG file is harder (e.g. adding video to the SVG hierarchy, animating an SVG group, ...).
For simple SVG files we can import the SVG into an HTML5 canvas (Fabric.js, Konva, ...) for editing. For large, complex SVG files this is slow to respond and uses a lot of memory, so we cannot add many SVGs to a canvas.
Is there a way to export each svg>g>g to a separate SVG file, with coordinates in the view box of the whole SVG, so that we can later put each exported SVG file in a div inside the HTML?
Best regards
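The split asked about above can be sketched as a post-processing step outside the library. The sketch below is in Python and is not a PdfToSvg.NET feature. It assumes the svg>g>g structure described in the post and that shared resources (such as style and defs elements) are direct children of the root. The function name split_nested_groups is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize without an "ns0:" prefix
G_TAG = f"{{{SVG_NS}}}g"


def split_nested_groups(svg_text):
    """Return one standalone SVG string per svg>g>g group."""
    root = ET.fromstring(svg_text)
    parts = []
    for outer in root.findall(G_TAG):
        for inner in outer.findall(G_TAG):
            # The new root keeps width, height and viewBox, so the group
            # stays positioned in the coordinate system of the whole page.
            new_root = ET.Element(root.tag, dict(root.attrib))
            # Carry over shared non-group children such as <style> and <defs>.
            for child in root:
                if child.tag != G_TAG:
                    new_root.append(copy.deepcopy(child))
            # Re-create the outer group shell so its attributes
            # (for example a transform) still apply to the inner group.
            shell = ET.SubElement(new_root, outer.tag, dict(outer.attrib))
            shell.append(copy.deepcopy(inner))
            parts.append(ET.tostring(new_root, encoding="unicode"))
    return parts
```

Because every exported file keeps the original viewBox, the files can be stacked in absolutely positioned divs of the same size and will line up with each other.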
We received two sample PDF files from a customer. They claimed that the "loading of the pdf" was incredibly slow.
And this turned out to be true:
It appears that "Optimize" tries to remove too many XML nodes. As you can see, "RemoveNode" takes 97.6% of the time.
We will try to investigate ourselves. However, it would be nice if you could give us a hint :-)
Just drop them into the 3rdParty folder and run the ConvertSync test and you'll see.
I want to extract all the images.
Hello
The SVG output is structured as a hierarchy svg>g>elements, where the elements are shapes or text. A group usually does not correspond to a single shape; sometimes it contains more than one shape (more than one object).
Is there an option to put each shape in its own group? For example, a car and a house would each get their own group: svg>g>g>a car, svg>g>g>a house. Currently the output is sometimes svg>g>g>a car, a house.
We tested with PDF.js; it always outputs one shape (object) per group (svg>g>g>a car).
Best regards
Hello,
I'm hoping to replace our use of Convertio and Inkscape with your fine library. I'm running into an issue with the generated SVGs, though. One of our use cases is to import SVGs into PowerPoint, but PowerPoint does not support shorthand font attributes (font: italic bold 22pt Arial); text styled that way just falls back to the default font. However, "longhand" works fine (font-family, font-size, font-weight, font-style).
I noticed in the source code that this is actually an optimisation that is done when outputting style classes and I was curious if you were willing to make it optional? Otherwise I will have to do a post processing step to undo the optimisation.
I can even provide a PR if you have limited time.
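The post-processing step mentioned above could look roughly like the Python sketch below. It expands a shorthand value such as the one quoted in the post into longhand properties. It is a simplified parser, not the library's code: the function name expand_font_shorthand is hypothetical, and it assumes the shorthand follows the order style, weight, size, family with a size that carries a unit.

```python
import re

FONT_STYLES = {"italic", "oblique"}
FONT_WEIGHTS = {"bold", "bolder", "lighter"} | {str(w) for w in range(100, 1000, 100)}
SIZE_PATTERN = re.compile(r"\d*\.?\d+(px|pt|em|rem|%)")


def expand_font_shorthand(value):
    """Split a CSS font shorthand value into longhand properties."""
    tokens = value.strip().rstrip(";").split()
    longhand = {}
    for index, token in enumerate(tokens):
        if token in FONT_STYLES:
            longhand["font-style"] = token
        elif token in FONT_WEIGHTS:
            longhand["font-weight"] = token
        elif SIZE_PATTERN.match(token):
            # The size may carry a line height ("22pt/1.2"); keep only the size.
            longhand["font-size"] = token.split("/")[0]
            # Everything after the size is the font family list.
            longhand["font-family"] = " ".join(tokens[index + 1:])
            break
    return longhand
```

Applying this to every font declaration in the SVG's style block and writing the resulting properties back would undo the shorthand optimisation without touching the rest of the file.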
Hello,
I found a small issue concerning the handling of transparency on some images.
When converting the attached PDF page, three images on the page that originally have a transparent background get a white background in the SVG.
Hello,
We need to pass a single PDF page to another library to extract HTML, so we want each page of a PDF saved as its own PDF file.
How can we save a PdfPage to a PDF file (one page per file)?
Would you consider making the MediaBox property public?
Best regards
Hi, first I want to thank you for sharing your great work here.
This library is the most reliable one I have found and suits almost all my needs. Great work! 👍
I just have a very specific font issue today:
I am converting PDF pages one by one, after splitting them with another tool, Ghostscript.
For the same PDF source file :
Source PDF :
SVG output:
Link to PDF source file
Link to generated SVG
Source PDF :
SVG output:
Link to PDF source file
Link to generated SVG
It seems that the generated @font-face in the SVG file is not working here. I tried manually replacing it with the @font-face from the first SVG, and everything looks fine. I assume that the font format is slightly different because of the Ghostscript version. I am aware that Ghostscript is not your concern, but since the font is displayed correctly in the PDF file, maybe there is a particular case to take into consideration?
Let me know if I can provide additional information.
CLEAR_NT_MAPS_Tok-Pisin_02-cropped.pdf
I am using the command line converter. Why is some text obfuscated and other text not? Can this be disabled?
L1929: Dekapolis
L1939: K
Thanks for an awesome tool!
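When converting the PDF to SVG, the folder structure does not match the PDF. Below is the code I used:
// Requires: using System; using System.IO; using PdfToSvg;
try
{
    // Path.Combine adds the directory separator that plain concatenation omits.
    using (var doc = PdfDocument.Open(Path.Combine(@"E:\PdfToSvg\Pdf", fileName + ".pdf")))
    {
        var pageNo = 1;
        var options = new SvgConversionOptions();
        foreach (var page in doc.Pages)
        {
            // One SVG file per page, named <fileName>-<pageNo>.svg.
            page.SaveAsSvg(Path.Combine(@"E:\PdfToSvg\Svg", fileName + "-" + pageNo + ".svg"), options);
            pageNo++;
        }
    }
    return true;
}
catch (Exception)
{
    // Assumption: the posted snippet ends before its catch block; failure is reported as false.
    return false;
}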
Hello, I encounter a problem when converting a PDF made from LaTeX source files to SVG. As you can see in the pictures below, some special characters are missing and others are incorrect. I do not encounter this problem with PDFs made with Word. One of the affected files is attached.
Link to PDF document
https://drive.google.com/file/d/1L0jBDQKHZ78WB-jKuri1AGj2SrkvlXrD/view?usp=share_link
First of all: Thank you so much for this awesome library!
We are trying to convert customers' plans into SVG. This mostly works like a charm,
however, we have issues with embedded fonts.
It turns out that WOFF or OTF does not make a difference, so it is definitely not a bug in the WoffBuilder
that translates OTF to WOFF.
The original PDF can be downloaded here: http://gofile.me/2itnX/zkvm6p7g0
the converted OTF version of the svg here: http://gofile.me/2itnX/Se5N8lQGQ
the converted WOFF version of the plan here: http://gofile.me/2itnX/uze6Bpebf
The issue is with text translation.
In the PDF, there is a gray box showing the room sizes. In the SVG the ² is slightly out of position.
More drastic is the overlapping in the footer of the document.
In the pdf:
Can you give us any hint where we should start searching?
For example: are there any OTF features that you left out?
Hi,
I am having an issue converting a specific PDF file to SVG, in particular one page of it. I use the latest version of your library (1.3.0).
Concerned file :
essai.pdf
Command used :
pdftosvg.exe essai.pdf
Thanks in advance for your kind help.
Let me know if I can provide more information.
Hi,
I just encountered an issue when trying to convert a PDF document: two pages of this document cannot be converted.
These pages were extracted from another PDF file with the latest version of Ghostscript.
I try to convert them one by one, using these commands:
pdftosvg.exe page2.pdf page2.svg --pages 1 --non-interactive --no-color
pdftosvg.exe page29.pdf page29.svg --pages 1 --non-interactive --no-color
I use version 1.2.0 of your library.
The two concerned files :
page2.pdf
page29.pdf
Thanks in advance for your kind help.
Let me know if I can provide more information.
I have a very simple PDF. The converter is crashing!
PDF has a page size limit of 14,400 x 14,400 PDF units, which is sufficient for common office documents.
For a document larger than about 5 meters (printed on a plotter with roll paper), however, the limit becomes relevant.
(See page 650 of https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf)
Below is an example of an (empty) 35 m long drawing, which is the largest size I have met so far.
The PDF uses a UserUnit of 7 to represent this drawing within the limit.
That means that coordinates within the generated SVG have to be multiplied by 7 to get the absolute values.
To represent the page correctly in SVG, the attributes "width" and "height" should be scaled by the UserUnit of the PDF page (here x7), and the viewBox should be kept as is.
Not: <svg width="14173" height="341" viewBox="0 0 14173.2 340.562" xmlns="http://www.w3.org/2000/svg">
But: <svg width="99212" height="2387" viewBox="0 0 14173.2 340.562" xmlns="http://www.w3.org/2000/svg">
Sample PDF: Userunit.pdf
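Until the library handles UserUnit itself, the proposed scaling can be approximated as a post-processing step. The Python sketch below is an illustration under assumptions, not the library's behaviour: the function name scale_svg_size is hypothetical, the UserUnit value must be supplied by the caller, and the new size is derived from the viewBox, so rounding can differ slightly from values computed from the already rounded width and height attributes.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize without an "ns0:" prefix


def scale_svg_size(svg_text, user_unit):
    """Scale the width/height attributes by the PDF UserUnit; keep the viewBox."""
    root = ET.fromstring(svg_text)
    # viewBox is "min-x min-y width height" (whitespace separated here).
    _, _, vb_width, vb_height = (float(v) for v in root.get("viewBox").split())
    root.set("width", str(round(vb_width * user_unit)))
    root.set("height", str(round(vb_height * user_unit)))
    return ET.tostring(root, encoding="unicode")
```

Only the outer dimensions change; every coordinate inside the file stays in viewBox units, which is what the proposal above asks for.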
For some documents, symbols (e.g. the degree Celsius symbol or Greek letters) are not converted correctly. For example:
Something seems to be wrong with font /F14. Here is the piece of the content stream related to the first picture:
Tm_(< 25)Tj_/F14 → 1 Tf_1.7474 0 TD_0 Tw_(m)Tj_/F9
Example pdf:
getdatasheetpartid-359780-15107033.pdf
Hello
We have tested a PDF file with a Type 1 font and images.
Type 1 font file:
complexfont1.pdf
The area of the photos is rendered black.
At first glance, extracting hidden text seems to make no sense. But some PDFs contain vector graphics for an exact representation of the content and, in parallel, hidden text to support search.
Hidden text in a PDF could be extracted into an SVG group with invisible text:
<g class="HiddenText" style="fill:none;">
<text x="100" y="200">This text is hidden</text>
</g>
This would allow searching in the SVG as well. The hidden text becomes visible when pressing Ctrl+A in the included sample.svg.
Hello,
pdftosvg worked very well on all the documents I tried but I report a problem with the following file:
Link to PDF
from which I get this output:
Link to SVG
As you can see from the SVG, all the text is shifted up from what the PDF viewer is showing.
Thanks.
Hello
When we convert this PDF to SVG, the content does not fit the viewport (PDF.js converts it correctly).
This looks like a bug.
Best regards