Topic: webarchiving
Something interesting about webarchiving
webarchiving,Wayback Machine API interface & a command-line tool
User: akamhy
Home Page: https://pypi.org/project/waybackpy/
webarchiving,Decentralized web archiving
Organization: archiveteam
webarchiving,Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Organization: archiveteam
Home Page: https://www.archiveteam.org/
webarchiving,Makes saving pages in bulk to the Wayback Machine much easier
User: archivingtoolsforwbm
webarchiving,Extracts links from DSpace repositories
Organization: arquivo
webarchiving,Digital archive of web pages related to the Guild of Information Networks
Organization: athenekilta
Home Page: https://athene.fi/arkisto/
webarchiving,pywb recorder over Tor; anonymously records the web (Docker image)
User: atomotic
Home Page: https://pywb.readthedocs.io/en/develop/manual/recorder.html
webarchiving,Record the currently active tab on webrecorder.io
User: atomotic
webarchiving,🗄 File-Based Reference Filing System.
Organization: basenana
webarchiving,Quick Cache and Archive search buttons
User: cipher387
Home Page: https://cybdetective.com
webarchiving,Various Jupyter notebooks about Common Crawl data
Organization: commoncrawl
webarchiving,metawarc: a command-line tool for extracting metadata from files stored in WARC (Web ARChive) archives
Organization: datacoon
webarchiving,Given four bytes, download a random file from web archives implementing the UKWA Shine interface
Organization: exponential-decay
webarchiving,An archiving utility with an interface for web servers.
User: gitdev-bash
webarchiving,WARC + AI - Experimental Retrieval Augmented Generation Pipeline for Web Archive Collections.
Organization: harvard-lil
webarchiving,Digital Preservation of HTTP in documentary heritage.
Organization: httpreserve
webarchiving,A helper package to tokenize textual content and retrieve hyperlinks
Organization: httpreserve
webarchiving,A wrapper for phantom.js commands for headless screenshots.
Organization: httpreserve
webarchiving,Tika based link (URL) extractor for httpreserve
Organization: httpreserve
webarchiving,A restricted API in Golang for the (semi-)exposed functions of the Internet Archive.
Organization: httpreserve
webarchiving,Client app for httpreserve pkg that generates CSV, JSON, HTTP, and BoltDB
Organization: httpreserve
webarchiving,A set of web archival replay test cases
User: ibnesayeed
Home Page: https://ibnesayeed.github.io/archival-tests/
webarchiving,An Awesome List for getting started with web archiving
Organization: iipc
webarchiving,A list of things related to software, literature, and other content for 🕣 Memento
User: machawk1
webarchiving,Parse a Heritrix crawl.log into an XML sitemap
User: mijho
webarchiving,Link crawler for a phpBB forum
Organization: mozillacz
webarchiving,Parse CDXJ (https://github.com/oduwsdl/ORS/wiki/CDXJ) files with Node.js
User: n0tan3rd
webarchiving,Parse and create Web ARChive (WARC) files with Node.js
User: n0tan3rd
webarchiving,Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
User: n0tan3rd
Home Page: https://n0tan3rd.github.io/Squidwarc/
webarchiving,
User: n0tan3rd
webarchiving,A tool for detecting viruses and NSFW material in WARC files
Organization: natliblux
webarchiving,News Archiver, Data Aggregation for CNN and Fox News
Organization: news-archiver
Home Page: https://newsarchiverdiff.com/
webarchiving,An archival thumbnail visualization server
Organization: oduwsdl
webarchiving,A social media open post web archiving tool
User: peterk
webarchiving,A dockerized, queued high fidelity web archiver based on Squidwarc
User: peterk
webarchiving,From WARC records to MongoDB documents
User: pierlauro
webarchiving,Awesome list dedicated to digital and data preservation tools, sources, services and so on.
Organization: ruarxive
webarchiving,This repository contains work done to determine how much of www.guideline.gov and qualitymeasures.ahrq.gov was archived.
User: shawnmjones
Home Page: https://ws-dl.blogspot.com/2018/07/2018-07-15-how-well-are-national.html
webarchiving,Offline storage of website data on Android
User: simonkocurek
webarchiving,Parser for WARC (aka WebArchive) files
Organization: toimik
webarchiving,A tool for archiving web pages on the Wayback Machine
Organization: ubuntucz
webarchiving,Seeder - Czech webarchive curating tool and public site
Organization: webarchivcz