Comments (3)
This also happens with enwiki-20180320-pages-articles.xml.bz2
Exception in thread "main" de.tudarmstadt.ukp.jwktl.api.WiktionaryException: Unable to save page Coca-Cola HBC AG
at de.tudarmstadt.ukp.jwktl.parser.WiktionaryArticleParser.saveParsedWiktionaryPage(WiktionaryArticleParser.java:156)
at de.tudarmstadt.ukp.jwktl.parser.WiktionaryArticleParser.onPageEnd(WiktionaryArticleParser.java:105)
at java.lang.Iterable.forEach(Iterable.java:75)
...
from dkpro-jwktl.
Just tried the English Wiktionary dump 20180420 with JWKTL 1.1.0. It worked without problems on Linux. Are you sure that you have the right dump? A classic mistake is using a Wikipedia dump rather than a Wiktionary one. Other than that, it could be a platform issue. You could try debugging to see what happens.
I feel a bit foolish, as I was trying to parse a Wikipedia dump file. The Wiktionary dump files have names like enwiktionary-20180501-pages-articles.xml.bz2.
It worked perfectly once I had the correct input file.
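In case it helps anyone else who hits the same exception, here is a minimal sketch of a file name check that could run before handing a dump to the parser. It is not part of JWKTL: the class `DumpNameCheck`, the method `looksLikeWiktionaryDump`, and the pattern are hypothetical, and the pattern only assumes the naming scheme seen in this thread (`<lang>wiktionary-<date>-pages-articles.xml.bz2`, plus the `latest` alias).

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

public class DumpNameCheck {

    // Assumed naming scheme: <lang>wiktionary-<yyyymmdd|latest>-pages-articles.xml[.bz2]
    private static final Pattern WIKTIONARY_DUMP = Pattern.compile(
            "^[a-z_]+wiktionary-(\\d{8}|latest)-pages-articles\\.xml(\\.bz2)?$");

    // Returns true if the file name (ignoring directories) matches the assumed scheme.
    public static boolean looksLikeWiktionaryDump(String path) {
        Path fileName = Paths.get(path).getFileName();
        return fileName != null
                && WIKTIONARY_DUMP.matcher(fileName.toString()).matches();
    }

    public static void main(String[] args) {
        // Print a verdict for every path given on the command line.
        for (String arg : args) {
            System.out.println(arg + " -> " + looksLikeWiktionaryDump(arg));
        }
    }
}
```

With this check, a Wikipedia dump such as enwiki-20180320-pages-articles.xml.bz2 is rejected up front, rather than failing later with "Unable to save page". A name check is only a heuristic; a renamed file would still slip through.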