Bram Zijlstra's Projects
A curated list of references to Dutch NLP libraries, datasets, and interesting literature.
Downloads and parses the subtitle dataset from opensubtitles.org
OPUS (opus.nlpl.eu) Python3 API
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded, and IoT devices)
Code for generating synthetic Dutch text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016. This repo is a fork