cneud
Name: Clemens Neudecker
Type: User
Company: Research @StaatsbibliothekBerlin
Location: Berlin, Germany
Blog: https://cneud.net/
Browser-based post-correction tool for ALTO XML files
Calculate OCR confidence per page in ALTO
Extract text from ALTO file
Python tools for performing various operations on ALTO XML files
Edit the ALTO directly in the XML
Classification of Wittgenstein's remarks
Train word embeddings on DTA texts using fastText
Data Mining Historical Newspaper Metadata (METS/ALTO formats)
JavaScript-based portal for searching Europeana collections and creating enrichments on the metadata
A Survey of OCR Evaluation Tools and Metrics (HIP'21)
Interoperability layer supporting the loose coupling of software components developed during the IMPACT project
vDHd2021 experiment
Named Entity Recognition tool for Europeana Newspapers
Named Entity Recognition corpus for (historical) Dutch, French, German
Fast classification of newspaper pages using fastai
Conversions between various OCR formats
OCR & Ground Truth Resources
Extract text from PAGE file
SCAPE demonstrator project for Taverna and Hadoop
String metrics and phonetic algorithms for Scala (e.g. Dice/Sorensen, Hamming, Jaccard, Jaro, Jaro-Winkler, Levenshtein, Metaphone, N-Gram, NYSIIS, Overlap, Ratcliff/Obershelp, Refined NYSIIS, Refined Soundex, Soundex, Weighted Levenshtein).
Warcbase is an open-source platform for managing and analyzing web archives
Forked from http://code.google.com/p/taverna/source/browse/portal/web-wf-design/trunk/web-wf-design/ for experimenting