HTracker is a work in progress and not yet ready for use.
- Add/remove URL from scrape list
- Configure filters per URL
- Configure scrape intervals per URL
- Provide web interface to show last updates of URL(s)
- Provide RSS feed for streaming changes
- Register/unregister an email address for notifications about changes to a URL
- watcher - reaches out to a set of sites
- feed - serves a feed of changes to subscribers
- push news - pushes news out to subscribers
- Periodically go through all subscribers, generate the list of sites that need to be scraped, and deduplicate it
- Scrape sites with similar filters/content types in batches
- Maybe: Notify notifier?
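
The watcher steps above (collect sites from subscribers, deduplicate, batch by filter) could be sketched roughly as follows. This is only an illustration: the `Subscription` type, its fields, and the `BatchSites` and `Describe` functions are hypothetical names, not part of HTracker.

```go
package main

import (
	"fmt"
	"sort"
)

// Subscription is a hypothetical record linking a subscriber to a URL.
type Subscription struct {
	Subscriber string
	URL        string
	Filter     string // hypothetical filter or content-type name
}

// BatchSites drops duplicate (URL, filter) pairs and groups the
// remaining URLs by filter so each group can be scraped as one batch.
func BatchSites(subs []Subscription) map[string][]string {
	type key struct{ url, filter string }
	seen := make(map[key]bool)
	batches := make(map[string][]string)
	for _, s := range subs {
		k := key{s.URL, s.Filter}
		if seen[k] {
			continue // another subscriber already requested this site
		}
		seen[k] = true
		batches[s.Filter] = append(batches[s.Filter], s.URL)
	}
	for f := range batches {
		sort.Strings(batches[f])
	}
	return batches
}

// Describe renders batches as sorted "filter: [urls]" lines for logging.
func Describe(batches map[string][]string) []string {
	lines := make([]string, 0, len(batches))
	for filter, urls := range batches {
		lines = append(lines, fmt.Sprintf("%s: %v", filter, urls))
	}
	sort.Strings(lines)
	return lines
}

func main() {
	subs := []Subscription{
		{Subscriber: "alice", URL: "https://example.com/news", Filter: "text"},
		{Subscriber: "bob", URL: "https://example.com/news", Filter: "text"},
		{Subscriber: "bob", URL: "https://example.org/feed", Filter: "xml"},
	}
	for _, line := range Describe(BatchSites(subs)) {
		fmt.Println(line)
	}
}
```

Keying the deduplication on the (URL, filter) pair means the same URL watched with two different filters is still scraped once per filter. Whether that is the desired behaviour is an open design question.
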
- Service providing access to site archive
- Maybe implement as RSS feed?
- Periodically go through all subscriptions and notify if the last notification is older than the notification period (deduplicate by subscriber)
- Maybe: send notifications immediately, if triggered by watcher?
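
A rough sketch of the periodic notification check described above, assuming each subscription stores its last notification time and its notification period. The `NotifySubscription` type, `DueSubscribers`, `Summary`, and the sample data are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// NotifySubscription is a hypothetical record of one subscriber/URL pair.
type NotifySubscription struct {
	Subscriber   string
	URL          string
	LastNotified time.Time
	Period       time.Duration
}

// DueSubscribers returns, per subscriber, the URLs whose last notification
// is older than the notification period. Grouping by subscriber means each
// subscriber gets one notification covering all due URLs.
func DueSubscribers(subs []NotifySubscription, now time.Time) map[string][]string {
	due := make(map[string][]string)
	for _, s := range subs {
		if now.Sub(s.LastNotified) > s.Period {
			due[s.Subscriber] = append(due[s.Subscriber], s.URL)
		}
	}
	for k := range due {
		sort.Strings(due[k])
	}
	return due
}

// Summary renders one sorted line per subscriber to notify.
func Summary(due map[string][]string) []string {
	lines := make([]string, 0, len(due))
	for who, urls := range due {
		lines = append(lines, fmt.Sprintf("notify %s about %v", who, urls))
	}
	sort.Strings(lines)
	return lines
}

// demoDue builds sample data: alice is overdue on two URLs, bob on none.
func demoDue() map[string][]string {
	now := time.Date(2024, 1, 2, 12, 0, 0, 0, time.UTC)
	day := 24 * time.Hour
	subs := []NotifySubscription{
		{Subscriber: "alice", URL: "https://example.com/a", LastNotified: now.Add(-25 * time.Hour), Period: day},
		{Subscriber: "alice", URL: "https://example.com/b", LastNotified: now.Add(-48 * time.Hour), Period: day},
		{Subscriber: "bob", URL: "https://example.com/a", LastNotified: now.Add(-1 * time.Hour), Period: day},
	}
	return DueSubscribers(subs, now)
}

func main() {
	for _, line := range Summary(demoDue()) {
		fmt.Println(line)
	}
}
```

Passing `now` in as a parameter keeps the check deterministic and easy to test. Sending notifications immediately when the watcher detects a change would bypass this loop entirely.
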
- NewScraper + opts per set of URLs
- Scraper.Start()
- NewExporter() -> register results (date, txt, checksum, diff)
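
One way the API outlined above might look in Go, using functional options for the per-URL-set configuration. Only `NewScraper`, `Scraper.Start()`, `NewExporter()` and the result fields (date, txt, checksum, diff) come from the notes. The `Fetcher` type, the `WithExporter` option, the `Register` method, and the `naiveDiff` helper are assumptions made for illustration, and the diff is a deliberately simplistic placeholder.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
	"time"
)

// Result carries the fields named in the notes: date, txt, checksum, diff.
type Result struct {
	URL, Text, Checksum, Diff string
	Date                      time.Time
}

// Exporter remembers the last result per URL (in memory, an assumption).
type Exporter struct{ last map[string]Result }

func NewExporter() *Exporter { return &Exporter{last: map[string]Result{}} }

// Register records a scrape and fills Diff when the checksum changed.
func (e *Exporter) Register(url, text string, date time.Time) Result {
	sum := sha256.Sum256([]byte(text))
	r := Result{URL: url, Text: text, Checksum: hex.EncodeToString(sum[:]), Date: date}
	if prev, ok := e.last[url]; ok && prev.Checksum != r.Checksum {
		r.Diff = naiveDiff(prev.Text, text)
	}
	e.last[url] = r
	return r
}

// naiveDiff is a placeholder: "-" marks lines only in old, "+" lines only in cur.
func naiveDiff(old, cur string) string {
	var b strings.Builder
	for _, l := range strings.Split(old, "\n") {
		if !strings.Contains("\n"+cur+"\n", "\n"+l+"\n") {
			b.WriteString("-" + l + "\n")
		}
	}
	for _, l := range strings.Split(cur, "\n") {
		if !strings.Contains("\n"+old+"\n", "\n"+l+"\n") {
			b.WriteString("+" + l + "\n")
		}
	}
	return b.String()
}

// Fetcher retrieves the text of a URL; injectable so the sketch runs offline.
type Fetcher func(url string) (string, error)

type Scraper struct {
	urls     []string
	fetch    Fetcher
	exporter *Exporter
}

// Option configures a Scraper for its set of URLs.
type Option func(*Scraper)

func WithExporter(e *Exporter) Option { return func(s *Scraper) { s.exporter = e } }

func NewScraper(urls []string, fetch Fetcher, opts ...Option) *Scraper {
	s := &Scraper{urls: urls, fetch: fetch, exporter: NewExporter()}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

// Start performs one pass over the URLs and registers each result.
func (s *Scraper) Start() ([]Result, error) {
	var out []Result
	for _, u := range s.urls {
		text, err := s.fetch(u)
		if err != nil {
			return out, fmt.Errorf("scrape %s: %w", u, err)
		}
		out = append(out, s.exporter.Register(u, text, time.Now()))
	}
	return out, nil
}

func main() {
	calls := 0
	stub := func(string) (string, error) { // stand-in for a real HTTP fetch
		calls++
		return fmt.Sprintf("headline\nversion %d", calls), nil
	}
	s := NewScraper([]string{"https://example.com"}, stub)
	for i := 0; i < 2; i++ {
		results, err := s.Start()
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("pass %d: diff=%q\n", i+1, results[0].Diff)
	}
}
```

In this sketch `Start()` runs a single pass. Scheduling passes at the configured scrape interval would sit on top of it, and a real implementation would replace `naiveDiff` with a proper diff algorithm.
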