Comments (7)
Indeed, the annotations displayed by PDF.js are incorrect, but this is not related to pdfalto: selecting text in the macOS system viewer shows that the coordinates are correct, for example for the token "Association":
I'm looking into how PDF.js processes this.
from pdfalto.
The coordinates displayed by Preview for the selection are not correct, look:
The origin is already shifted... I think the coordinates I put above from PDF.js are the correct/expected ones for an origin at (0,0) (x:57, y:90 for the top-left corner of "Association", and not x:85, y:126 from pdfalto or x:83.92, y:124.62 from Preview). Otherwise, the page size is somehow not correct and should be shifted/rescaled accordingly.
OK, I think I know the problem: a PDF page actually has several boxes (media/crop/bleed), each used by particular printing equipment. When these boxes are not the same size, it leads to this kind of issue. I'll see how to fix this.
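To illustrate why a media box / crop box mismatch shifts the reported coordinates, here is a minimal sketch. It is not pdfalto's code: the `Box` type, the `media_to_crop` helper, and the box sizes are all hypothetical, with the sizes chosen only so that the offset matches the roughly (28, 36) difference reported in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A PDF page box in PDF user space (origin at the bottom-left corner)."""
    llx: float  # lower-left x
    lly: float  # lower-left y
    urx: float  # upper-right x
    ury: float  # upper-right y


def media_to_crop(x, y, media, crop):
    """Convert a point given with a top-left origin on the media box
    into the same point with a top-left origin on the crop box."""
    dx = crop.llx - media.llx  # horizontal inset of the crop box
    dy = media.ury - crop.ury  # vertical inset, measured from the top edge
    return x - dx, y - dy


# Hypothetical boxes (assumption, not read from the actual PDF):
# the crop box is inset by 28 units on the left and 36 units at the top.
media = Box(0, 0, 651, 864)
crop = Box(28, 36, 623, 828)

# Top-left corner of "Association" as reported relative to the media box.
print(media_to_crop(85, 126, media, crop))  # (57, 90)
```

Under these assumed boxes, the media-box coordinates x:85, y:126 map to x:57, y:90, which is what a viewer that renders only the crop box would expect.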
So I've made a change to use the crop box by default instead of the media box: b14cd4e
This should probably be an option on the pdfalto command line, what do you think?
Just a reminder: this was a legacy from pdf2xml.
Yes, the issue came from pdf2xml!
Your fix entirely solves the issue for all my example cases, and everything is still fine with the usual documents, so that's great, many thanks!