Comments (2)
I'm not an ABI expert or anything, but I have a feeling there's a problem in that general direction.
I think you should try building both from source in a Python virtual environment that also has pybind11 installed, so that both use the same pybind11 version. If you were using the pikepdf binary wheels (likely, given the procedure you showed), then you're using a build that was linked against different (but theoretically ABI-compatible) standard libraries from yours.
The two packages may also have been compiled with different pybind11 versions.
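As a first diagnostic, it may help to see which versions are actually installed in the environment. The following is a minimal sketch using only the standard library; the function name `installed_versions` is made up for illustration. Note that it reports the pybind11 package present in the environment, not the pybind11 version a prebuilt binary wheel was compiled against, so it can only hint at a mismatch.

```python
from importlib import metadata


def installed_versions(packages=("pikepdf", "fasttext", "pybind11")):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the virtual environment used for the build shows whether both packages and pybind11 are visible there at all.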
Unrelatedly, if you're only interested in text extraction from PDFs, pdfminer.six may be a better fit. pikepdf doesn't do text extraction, since that is a very complex problem on its own. In general, the values that a PDF content stream tells the viewer to draw are glyph IDs in a font, not character IDs. If you're lucky there's a 1:1 correspondence between glyph IDs and Unicode, but in general this isn't the case.
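To illustrate why glyph IDs are not text, here is a toy sketch. The table `TOY_TO_UNICODE` and the function `glyphs_to_text` are hypothetical and do not come from pikepdf or pdfminer.six; a real extractor has to recover such a table from the font itself (for example from its optional ToUnicode map), and that is where most of the complexity lives.

```python
def glyphs_to_text(glyph_ids, to_unicode, missing="\ufffd"):
    """Map glyph IDs to text using a font-specific lookup table.

    Glyph IDs without an entry become U+FFFD (replacement character).
    """
    return "".join(to_unicode.get(gid, missing) for gid in glyph_ids)


# Hypothetical table for one subsetted font: glyph ID -> Unicode text.
# The IDs are arbitrary, and a ligature glyph maps to two characters.
TOY_TO_UNICODE = {1: "o", 2: "f", 3: "fi", 4: "c", 5: "e"}

# Five glyphs are drawn, but they spell a six-letter word.
print(glyphs_to_text([1, 2, 3, 4, 5], TOY_TO_UNICODE))  # prints "office"
```

Without the right table for each font, the same glyph IDs decode to nothing useful, which is the unlucky case described above.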
from pikepdf.
First of all, thanks for the interest, @jbarlow83.
I also thought that the problem was some kind of collision between the compiled code of fasttext and that of pikepdf. However, I tried (as your instructions for installing pikepdf from source say) to install both qpdf and pikepdf with a specified compiler, by setting the CC and CXX environment variables to gcc and g++ respectively. I got the same error that those instructions describe (the ImportError one, because it seems that setup.py does not pick up the same compiler even with CC and CXX set), so I ended up giving up.
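For reference, the pip part of that attempt can be sketched roughly as follows. The helper names `source_build_command` and `compiler_env` are made up for illustration, the sketch does not cover building qpdf, and whether the build actually honors CC and CXX is exactly the open question here, so treat this as an assumption rather than a known-good recipe.

```python
import os
import sys


def source_build_command(package):
    """pip command that skips the binary wheel and builds `package` from source."""
    return [sys.executable, "-m", "pip", "install", "--no-binary", package, package]


def compiler_env(cc="gcc", cxx="g++", base=None):
    """Copy of the environment (or of `base`) with CC and CXX set."""
    env = dict(os.environ if base is None else base)
    env["CC"] = cc
    env["CXX"] = cxx
    return env


cmd = source_build_command("pikepdf")
env = compiler_env()
print(" ".join(cmd), f"with CC={env['CC']} CXX={env['CXX']}")
# To actually run the build:
#   import subprocess; subprocess.run(cmd, env=env, check=True)
```

Using `sys.executable -m pip` keeps the install inside the currently active virtual environment instead of whichever pip happens to be first on the PATH.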
After that, I tried installing fasttext from this pip package: https://pypi.org/project/fasttext/
And the problem seems to be gone. I do not know whether pip is resolving some kind of collision that was happening before, or whether pip is simply compiling the binding with the same compiler. For now I am going to stick with that solution, because I am far from understanding how pybind11 works and why the segmentation fault could be happening.