mehdibenamorr / german-wikipedia-text-corpus
This project is forked from t-systems-on-site-services-gmbh/german-wikipedia-text-corpus
This is a German text corpus from Wikipedia. It is cleaned, preprocessed, and split into sentences. Its purpose is to train NLP embeddings such as fastText or ELMo (deep contextualized word representations).
Home Page: https://www.t-systems-onsite.de/impressum
License: Other