Comments (8)
Any update on this?
I finally got this installed after fiddling with pip and dependencies all day, only to find that uploading a file just refreshes the page over and over without doing anything.
Running Python 3.7, Ghostscript 10.00.0
from excalibur.
fwiw, this Docker image is quite recent and just works: ghcr.io/williamjacksn/excalibur
Same situation.
So for anyone else struggling to get the right dependencies installed, I found this docker image that still works like a charm.
I am new to ALL of this and am having the same issue. It gets stuck in a 304 loop while parsing a downloaded copy of the document I chose to test with (the Arizona Math Standards PDF, which fits my use case). It ran for several hours before I discovered the problem. Now I am trying the Docker image above. To use it I had to download Docker Desktop, and I ran the image from the link posted by r5574. It doesn't work for me. I run the image and try to navigate to the new address (from the Docker Desktop output) in my browser (http://0.0.0.0:80/) and all I get is:
"Unable to connect
An error occurred during a connection to 0.0.0.0:80.
The site could be temporarily unavailable or too busy. Try again in a few moments.
If you are unable to load any pages, check your computer’s network connection.
If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the web."
I get a similar error for the localhost address (127.0.0.1:80...). When I ran Excalibur from the Windows command line (using the Windows package), at least the web page loaded properly, but then I got the loop. Is there any way I can run Excalibur on Windows without the loop? Where does Docker Desktop download the project? Perhaps I could run it from the Windows command line?
Or is there any other way to extract tables from PDFs without having to create an extraction template? I have talked to a company that offers a commercial app, but they said I had to create a template for extraction. I want to be able to download teaching standards from anywhere on the internet and extract the standards from the PDF document. I can't use a template. Which is why I came here... (plus it's free lol...)
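A note on the connection error described above: 0.0.0.0 is normally a bind address ("listen on all interfaces") rather than an address to browse to, so a browser may refuse it even when the container is running. A container port is also only reachable from the host if it was published when the container was started (for example via Docker's `-p` option). The following is a minimal sketch, using only the Python standard library, for checking whether anything is listening on the host side. The port number 80 is taken from the comment above; whether the image actually serves on that port is an assumption, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 80 comes from the Docker Desktop output quoted above (assumption:
    # the container publishes that port to the host).
    for host in ("127.0.0.1", "localhost"):
        state = "reachable" if port_is_open(host, 80) else "not reachable"
        print(f"{host}:80 is {state}")
```

If the port is reported as not reachable, the container is probably not publishing it to the host; if it is reachable, browsing to http://localhost:80/ instead of http://0.0.0.0:80/ is worth trying.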
Sadly, I had no luck with those Docker images on Windows 10.
Duplicate issue of:
#154
#117
#69
ghcr.io/williamjacksn/excalibur
This Docker image only occasionally works for me on macOS Sonoma with an M1.
The website always loads the home page correctly, but beyond that things don't work reliably. Most of the time I get an endless loading screen as it tries to convert the PDF into images. Oddly, things seemed to improve when I killed the Docker process during the endless 'converting to image' step and then restarted it. I've also noticed that 'stream' is less likely to result in endless loading than 'lattice'.
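Since Excalibur is a web interface on top of Camelot, one way to sidestep the web UI and its image-conversion step is to call Camelot directly from Python. The sketch below assumes the `camelot` package is installed and working (which may hit the same dependency problems discussed in this thread); the file name `standards.pdf` and the helper names `choose_flavor` and `extract_tables` are hypothetical. It only illustrates the 'stream' versus 'lattice' choice mentioned above and is not a fix for the endless-loading issue.

```python
import sys


def choose_flavor(has_ruling_lines: bool) -> str:
    """Pick a Camelot flavor.

    'lattice' targets tables whose cells are separated by drawn lines;
    'stream' targets tables whose columns are separated by whitespace.
    """
    return "lattice" if has_ruling_lines else "stream"


def extract_tables(pdf_path: str, pages: str = "1", has_ruling_lines: bool = False):
    """Return the tables found in pdf_path as a list of pandas DataFrames."""
    # Imported lazily so the rest of the script still loads if Camelot
    # is not installed.
    import camelot

    tables = camelot.read_pdf(
        pdf_path,
        pages=pages,  # e.g. "1", "1,3", "1-5" or "all"
        flavor=choose_flavor(has_ruling_lines),
    )
    return [table.df for table in tables]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python extract.py standards.pdf
    for i, df in enumerate(extract_tables(sys.argv[1]), start=1):
        print(f"--- table {i}: {df.shape[0]} rows x {df.shape[1]} columns ---")
        print(df.head())
```

Starting with a single page and the 'stream' flavor keeps the run short, which makes it easier to tell a slow parse from a hang.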