extractor-wiki-data

This script extracts multi-language data (labels, descriptions, and wiki page titles) from a Wikidata dump. Anyone can freely use and update this code.

Downloading the Wikidata dump

wget http://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-articles.xml.bz2

Running the script

$ bzip2 -dc ./wikidatawiki-latest-pages-articles.xml.bz2 | python ./extractor-wiki-data.py  2> /dev/null  | head -n 30
Q15     labels  en      Africa
Q15     labels  ja      アフリカ
Q15     labels  zh      非洲
Q15     labels  ko      아프리카
Q15     labels  fr      Afrique
Q15     labels  ar      أفريقيا
Q15     labels  de      Afrika
Q15     labels  es      África
Q15     labels  tr      Afrika
Q15     labels  vi      châu Phi
Q15     labels  pt      África
Q15     labels  ru      Африка
Q15     descriptions    en      continent
Q15     descriptions    ja      大陸
Q15     descriptions    zh      七大洲之一
Q15     descriptions    ko      아시아 다음으로 면적이 넓고 인구가 많은 대륙
Q15     descriptions    fr      continent
Q15     descriptions    ar      قارة
Q15     descriptions    de      Kontinent
Q15     descriptions    es      continente
Q15     descriptions    pt      continente
Q15     descriptions    ru      второй по площади континент после Евразии,омываемый Средиземным морем с севера
Q15     wiki    enwiki  Africa
Q15     wiki    jawiki  アフリカ
Q15     wiki    zhwiki  非洲
Q15     wiki    kowiki  아프리카
Q15     wiki    frwiki  Afrique
Q15     wiki    arwiki  أفريقيا
Q15     wiki    dewiki  Afrika
Q15     wiki    eswiki  África
...
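
The output above appears to have four whitespace-separated columns: entity ID, record type (labels, descriptions, or wiki), a language or site code, and the text value. As a minimal sketch of how it might be consumed downstream, the following groups records by entity. The helper names parse_line and group_by_entity are hypothetical and not part of extractor-wiki-data.py, and the column separator is assumed to be tabs or spaces.

```python
from collections import defaultdict


def parse_line(line):
    """Split one output line into (entity_id, kind, key, value).

    Assumes four whitespace-separated columns; the value column may
    itself contain spaces, so only the first three gaps are split.
    Returns None for lines that do not have four columns.
    """
    parts = line.strip().split(None, 3)
    if len(parts) != 4:
        return None
    return tuple(parts)


def group_by_entity(lines):
    """Build {entity_id: {kind: {key: value}}} from output lines."""
    entities = defaultdict(lambda: defaultdict(dict))
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip blank or malformed lines
        entity_id, kind, key, value = record
        entities[entity_id][kind][key] = value
    return entities


# A few lines copied from the sample output (tab-separated here by assumption).
sample = [
    "Q15\tlabels\ten\tAfrica\n",
    "Q15\tlabels\tvi\tchâu Phi\n",
    "Q15\tdescriptions\ten\tcontinent\n",
    "Q15\twiki\tenwiki\tAfrica\n",
    "...\n",
]
result = group_by_entity(sample)
print(result["Q15"]["labels"]["vi"])
```

To process the real output, the same function can be given sys.stdin in place of the sample list, with this sketch appended to the end of the shell pipeline.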
