Comments (6)
We could have one 404 error page per website.
from suckit.
As long as you're aware that it is an opinionated choice :) Some sites have custom 404s per section of the site, some keep the original URL like in my screenshot, some redirect to a dedicated 404 URL, and some show a 404 page with a 200 response. Web crawling is messy!
Perhaps this could be a configuration option, but that's up to you :)
from suckit.
A good solution could be to hash every 404 or 200 page. That way, if the page content is specific to its URL it is saved, and if not we could make a symbolic link to the generic one.
from suckit.
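To make the hashing idea concrete, here is a minimal sketch, not suckit's actual code. `PageDeduper`, `SaveAction`, and `decide` are hypothetical names invented for illustration. It uses the standard library's `DefaultHasher`, which is not collision resistant, so a real implementation would probably want a cryptographic hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::path::PathBuf;

/// What to do with a freshly downloaded page body (hypothetical type).
#[derive(Debug, PartialEq)]
enum SaveAction {
    /// First time this content is seen: write it to disk at this path.
    Write(PathBuf),
    /// Identical content was already saved: link to that file instead.
    LinkTo(PathBuf),
}

/// Hypothetical helper that remembers the content hash of every saved page.
#[derive(Default)]
struct PageDeduper {
    seen: HashMap<u64, PathBuf>,
}

impl PageDeduper {
    /// Hash the raw body bytes. Assumption: byte-identical bodies mean
    /// "the same generic page"; pages that embed the requested URL differ.
    fn hash_body(body: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        body.hash(&mut hasher);
        hasher.finish()
    }

    /// Decide whether `body` should be written to `dest` or linked to an
    /// earlier file with the same content.
    fn decide(&mut self, body: &[u8], dest: PathBuf) -> SaveAction {
        let key = Self::hash_body(body);
        match self.seen.get(&key) {
            Some(existing) => SaveAction::LinkTo(existing.clone()),
            None => {
                self.seen.insert(key, dest.clone());
                SaveAction::Write(dest)
            }
        }
    }
}
```

On Unix, the `LinkTo` case could then be handled with `std::os::unix::fs::symlink(existing, dest)`. A 404 page that embeds the requested URL in its body hashes differently each time, so it would still be saved once per URL.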
Yeah, I think it's tricky. If it's legitimately just a bad link to a page that never existed, or an href that was relative when it shouldn't have been, you might hit an infinite loop (I've seen this in practice).
from suckit.
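One possible guard against that kind of loop is sketched below. It assumes the loop shows up as the same path segment repeating (for example `/docs/api/api/api/...` when a 404 page carries a bad relative link). `looks_like_link_loop` and `max_repeats` are hypothetical names, and this is only a heuristic, not something suckit does today.

```rust
/// Returns true when the same path segment occurs more than `max_repeats`
/// times in a row, which hints at a runaway relative link.
/// Hypothetical helper for illustration only.
fn looks_like_link_loop(path: &str, max_repeats: usize) -> bool {
    let mut run = 0usize;
    let mut prev: Option<&str> = None;
    // Skip empty segments produced by leading, trailing, or doubled slashes.
    for seg in path.split('/').filter(|s| !s.is_empty()) {
        if Some(seg) == prev {
            run += 1;
        } else {
            run = 1;
            prev = Some(seg);
        }
        if run > max_repeats {
            return true;
        }
    }
    false
}
```

A crawler could skip any URL whose path trips this check. Legitimate sites with deliberately repeated segments would be misclassified, so the threshold would likely need to be configurable.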
Hmm, OK. We have more serious issues and very little time currently, so we will give this a try later.
from suckit.
Yeah, no rush :)
from suckit.