
Wikiforia

What is it?

Wikiforia is a library and a command-line tool for parsing Wikipedia XML dumps and converting them into plain text for other tools to use.

Why use it?

Subjectively, it generates good results and is reasonably fast: on my laptop (4 physical cores, 8 logical threads, 2.3 GHz Haswell Core i7) it averages about 6,000 pages/sec, or roughly 10 minutes for the 2014-08-18 Swedish Wikipedia dump. Your results may of course vary.

How to use?

Download a multistreamed Wikipedia bzip2 dump. It consists of two files: an index file and a file with the pages.

For the Swedish Wikipedia dump of 2014-08-18, the files are named:

svwiki-20140818-pages-articles-multistream-index.txt.bz2
svwiki-20140818-pages-articles-multistream.xml.bz2

Make sure the file names are intact, because otherwise the automatic language resolution does not work. The default language is English, and the language setting affects parsing quality.

Both compressed files must be placed in the same directory for the command below to work properly.

To run the tool, go to the dist/ directory in your terminal and run:

java -jar wikiforia-1.0-SNAPSHOT.jar 
     -pages [path to the file ending with multistream.xml.bz2] 
     -output [output xml path]

This runs with default settings, i.e. a thread count equal to the number of cores you have and a batch size of 100. These settings can be overridden; for a full listing of options, just run:

java -jar wikiforia-1.0-SNAPSHOT.jar

Output

The output of the tool is an XML file with the following structure (example data):

<?xml version="1.0" encoding="utf-8"?>
<pages>

<page id="4" title="Alfred" revision="1386155063000" type="text/x-wiki" ns-id="0" ns-name="">Alfred, 
with a new line</page>

<page id="10" title="Template:Infobox" revision="1386155040000" type="text/x-wiki" ns-id="10" ns-name="Template">Template stuff</page>
</pages>
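This output can be consumed with any streaming XML parser. As a minimal sketch (not part of Wikiforia itself), the JDK's built-in StAX API can iterate over the page elements; the string below reuses the example data above:

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;

public class ReadPages {
    public static void main(String[] args) throws Exception {
        // A tiny document in the output format shown above.
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<pages>"
                + "<page id=\"4\" title=\"Alfred\" revision=\"1386155063000\""
                + " type=\"text/x-wiki\" ns-id=\"0\" ns-name=\"\">Alfred,\nwith a new line</page>"
                + "</pages>";

        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT
                    && r.getLocalName().equals("page")) {
                // Attributes as documented in "Attribute information";
                // getElementText() reads the page body.
                String id = r.getAttributeValue(null, "id");
                String title = r.getAttributeValue(null, "title");
                String text = r.getElementText();
                System.out.println(id + "\t" + title + "\t" + text.length());
            }
        }
    }
}
```

Streaming rather than building a DOM matters here: a full English Wikipedia export is far too large to hold in memory at once.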

Attribute information

id
The Wikipedia page id
title
The title of the Wikipedia page
revision
The revision timestamp as given by the dump, converted to milliseconds since the UNIX epoch
type
The format; always text/x-wiki in this version of the tool
ns-id
The namespace id; 0 is the principal namespace, which contains all articles. See [Namespaces on Wikipedia](http://en.wikipedia.org/wiki/Wikipedia:Namespace) for more information.
ns-name
The localized name of the namespace; for namespace 0 it is usually just an empty string
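Since the revision attribute is milliseconds since the UNIX epoch, it can be turned back into a readable timestamp with the JDK's java.time classes. A small sketch using the example revision value from the output above:

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

public class RevisionTime {
    public static void main(String[] args) {
        // The revision attribute from the example page above.
        long revision = 1386155063000L;

        // Interpret the value as milliseconds since the UNIX epoch.
        Instant instant = Instant.ofEpochMilli(revision);

        // Render it as an ISO-8601 UTC timestamp.
        System.out.println(DateTimeFormatter.ISO_INSTANT.format(instant));
    }
}
```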

Plaintext export

Contributed by @smhumayun: support for a plain-text output format on top of the existing XML format. Use case: extract only the text from Wikipedia, e.g. to use it as a corpus for machine-learning experiments.

To run it: download wikiforia-x.y.z.jar from the dist/ directory, open your terminal, cd to the download location, and run:

java -jar wikiforia-x.y.z.jar 
     -pages [path to the file ending with multistream.xml.bz2] 
     -output [output path]
     -outputformat plain-text

Remarks

Empty articles, for which no text could be extracted, are not included. This covers redirects and most templates and categories, since they contain no useful text. If you use the API, you can still extract this information.

Language support

270 language-specific configurations have been generated from the publicly available Wikimedia source tree. The quality of these autogenerated configurations is uncertain, as they are untested. Please confirm whether your language works, or report it if it does not, so that the issue can be mitigated.

English is used as the fallback language when parsing.

API

The code can also be used directly to extract more information.

More information will be added; for now, take a look at se.lth.cs.nlp.wikiforia.App and its convert method to get an idea of how to use the code.

Credits

Peter Exner, the author of KOSHIK. The Sweble code is partially based on the KOSHIK version.

Sweble, developed by the Open Source Research Group at the Friedrich-Alexander-University of Erlangen-Nuremberg. This library is used to parse the Wikimarkup.

Woodstox, a fast XML parser, used to parse the XML dump and write the XML output.

Apache Commons, a collection of useful and excellent libraries; Commons CLI is used for option parsing.

Wikipedia, without which this project would be useless. Test data has been extracted from the Swedish Wikipedia and is covered by the CC BY-SA 3.0 licence.

Licence

The licence is GPLv2.

wikiforia's People

Contributors

fbroda, marcusklang, simonlindberg, smhumayun


wikiforia's Issues

Invalid white space character exception

Hi,

I'm using Wikiforia (version 1.1.1) to parse the English Wikipedia dump (version 20150602) and encounter an error that makes Wikiforia stop. Below is the log. How can I fix this?

java.io.IOError: com.ctc.wstx.exc.WstxIOException: Invalid white space character (0x14) in text to output (in xml 1.1, could output as a character entity)
at se.lth.cs.nlp.io.XmlWikipediaPageWriter.process(XmlWikipediaPageWriter.java:91) ~[wikiforia-1.1.1.jar:?]
at se.lth.cs.nlp.pipeline.AbstractEmitter.output(AbstractEmitter.java:44) ~[wikiforia-1.1.1.jar:?]
at se.lth.cs.nlp.pipeline.IdentityFilter.process(IdentityFilter.java:11) ~[wikiforia-1.1.1.jar:?]
at se.lth.cs.nlp.pipeline.AbstractEmitter.output(AbstractEmitter.java:44) ~[wikiforia-1.1.1.jar:?]
at se.lth.cs.nlp.wikipedia.parser.SwebleWikimarkupParserBase.process(SwebleWikimarkupParserBase.java:91) ~[wikiforia-1.1.1.jar:?]
at se.lth.cs.nlp.pipeline.AbstractEmitter.output(AbstractEmitter.java:44) ~[wikiforia-1.1.1.jar:?]
at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser.access$300(MultistreamBzip2XmlDumpParser.java:43) ~[wikiforia-1.1.1.jar:?]
at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$Worker.run(MultistreamBzip2XmlDumpParser.java:349) ~[wikiforia-1.1.1.jar:?]
Caused by: com.ctc.wstx.exc.WstxIOException: Invalid white space character (0x14) in text to output (in xml 1.1, could output as a character entity)
at com.ctc.wstx.sw.BaseStreamWriter.writeCharacters(BaseStreamWriter.java:462) ~[woodstox-core-asl-4.2.0.jar:4.2.0]
at se.lth.cs.nlp.io.XmlWikipediaPageWriter.process(XmlWikipediaPageWriter.java:85) ~[wikiforia-1.1.1.jar:?]
... 7 more
Caused by: java.io.IOException: Invalid white space character (0x14) in text to output (in xml 1.1, could output as a character entity)
at com.ctc.wstx.api.InvalidCharHandler$FailingHandler.convertInvalidChar(InvalidCharHandler.java:55) ~[woodstox-core-asl-4.2.0.jar:4.2.0]
at com.ctc.wstx.sw.XmlWriter.handleInvalidChar(XmlWriter.java:623) ~[woodstox-core-asl-4.2.0.jar:4.2.0]
at com.ctc.wstx.sw.BufferingXmlWriter.writeCharacters(BufferingXmlWriter.java:554) ~[woodstox-core-asl-4.2.0.jar:4.2.0]
at com.ctc.wstx.sw.BaseStreamWriter.writeCharacters(BaseStreamWriter.java:460) ~[woodstox-core-asl-4.2.0.jar:4.2.0]
at se.lth.cs.nlp.io.XmlWikipediaPageWriter.process(XmlWikipediaPageWriter.java:85) ~[wikiforia-1.1.1.jar:?]
... 7 more
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:400)
at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$ParallelDumpStream.getNext(MultistreamBzip2XmlDumpParser.java:286)
at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$ParallelDumpStream.read(MultistreamBzip2XmlDumpParser.java:305)
at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.init(BZip2CompressorInputStream.java:232)
at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.complete(BZip2CompressorInputStream.java:348)
at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.initBlock(BZip2CompressorInputStream.java:284)
at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.setupNoRandPartA(BZip2CompressorInputStream.java:868)
at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.setupNoRandPartB(BZip2CompressorInputStream.java:917)
at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.read0(BZip2CompressorInputStream.java:217)
at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.read(BZip2CompressorInputStream.java:172)
at com.ctc.wstx.io.BaseReader.readBytes(BaseReader.java:155)
at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:368)
at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:111)
at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:87)
at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:991)
at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4647)
at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4147)
at com.ctc.wstx.sr.BasicStreamReader.getElementText(BasicStreamReader.java:679)
at se.lth.cs.nlp.mediawiki.parser.XmlDumpParser.processPages(XmlDumpParser.java:276)
at se.lth.cs.nlp.mediawiki.parser.XmlDumpParser.next(XmlDumpParser.java:337)
at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$Worker.run(MultistreamBzip2XmlDumpParser.java:345)

Exception thrown in main

I tried

java -jar wikiforia-1.2.1.jar -pages /home/sudeshna/wikiforia-master/enwiki-20161201-pages-articles-multistream.xml.bz2 -output /home/sudeshna/ -outputformat plain-text

I am getting this exception

Exception in thread "main" java.io.IOError: java.io.IOException: unexpected end of stream
	at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$PageReader.<init>(MultistreamBzip2XmlDumpParser.java:213)
	at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser.<init>(MultistreamBzip2XmlDumpParser.java:107)
	at se.lth.cs.nlp.wikiforia.App.convert(App.java:303)
	at se.lth.cs.nlp.wikiforia.App.main(App.java:488)
Caused by: java.io.IOException: unexpected end of stream
	at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.bsGetBit(BZip2CompressorInputStream.java:398)
	at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.recvDecodingTables(BZip2CompressorInputStream.java:499)
	at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.getAndMoveToFrontDecode(BZip2CompressorInputStream.java:573)
	at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.initBlock(BZip2CompressorInputStream.java:311)
	at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.<init>(BZip2CompressorInputStream.java:133)
	at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.<init>(BZip2CompressorInputStream.java:109)
	at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$PageReader.readHeader(MultistreamBzip2XmlDumpParser.java:237)
	at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser$PageReader.<init>(MultistreamBzip2XmlDumpParser.java:211)
	... 3 more

What am I doing wrong?

Thanks!

Missing words and bad hyphenation in French

java -jar target/wikiforia-1.2.1.jar --pages ../frwiki-20150602-pages-articles-multistream.xml.bz2 -lang fr -o xml

Interrupted after a couple of minutes, since the issue appears in the first pages.

Example : Amsterdam, id = 245

Le est considéré comme l'âge d'or d'Amsterdam car elle devient à cette époque la ville la plus riche du monde.

should be

Le XVIIe siècle est considéré comme l'âge d'or d'Amsterdam car elle devient à cette époque la ville la plus riche du monde

LAndalousie

should be

L'Andalousie

Null pointer exception executing the tool with Bavarian wiki

Hey Marcus, I tried

git clone
mvn compile
mvn package

So far so good (OK, I had a little trouble figuring out that the easiest way to satisfy the external dependencies is to switch to the target directory and run from there).

then

cd target
wget https://dumps.wikimedia.org/barwiki/20151002/barwiki-20151002-pages-articles-multistream-index.txt.bz2
wget https://dumps.wikimedia.org/barwiki/20151002/barwiki-20151002-pages-articles-multistream.xml.bz2

When I now run

`java -jar wikiforia-1.2.1.jar -pages barwiki-20151002-pages-articles-multistream.xml.bz2 -output res.xml`

I receive the following output:

[2015-10-14 15:14:55.728 | main | INFO  | se.lth.cs.nlp.wikiforia.App] Wikiforia v1.2.1 by Marcus Klang
Exception in thread "main" java.lang.NullPointerException
    at se.lth.cs.nlp.mediawiki.parser.MultistreamBzip2XmlDumpParser.toString(MultistreamBzip2XmlDumpParser.java:480)
    at se.lth.cs.nlp.wikiforia.Pipeline.run(Pipeline.java:73)
    at se.lth.cs.nlp.wikiforia.App.convert(App.java:239)
    at se.lth.cs.nlp.wikiforia.App.main(App.java:413)

looking at

return String.format("Multistreamed Bzip2 XML Dump parser { \n * Threads: %s, \n * Batch size: %s, \n * Index: %s, \n * Pages: %s, \n * Basepath: %s \n}",

I see that there must be some class fields not initialized, but I didn't go into further debugging.

ls shows me that the file res.xml was created, so I assume that argument passing works and something else in the class fields is not set correctly.

Did I do something wrong? Is the tool just not working with the Bavarian Wikipedia? Comparing git hashes, I found this in the git log:

commit 04e80b46ecc1bb487419fb9f831258be78413f07
Author: Marcus Klang <[email protected]>
Date:   Tue Mar 24 11:08:08 2015 +0100

    * Added French, German and Spanish configurations

which made me wonder whether my dump could be the reason. Thanks for the help!
I am not particularly interested in the Bavarian Wikipedia, but I wanted to test the tool with small data (:

Best, Rene

Maven repository

Hello. Will you deploy your cool project to the global Maven repository?
I tried to search for the following definition from pom.xml, but there are no results:

<groupId>se.lth.cs.nlp</groupId>
<artifactId>wikiforia</artifactId>
<version>1.2.1</version>

Main method throws java.nio.file.AccessDeniedException

So I ran the main method using the following command:
java -jar wikiforia-1.2.1.jar -pages /home/blo/wikiforia-master/enwiki-20170420-pages-articles1.xml-p10p30302.bz2 -output /home/blo/wikiforia-master/wiki-extract/ -outputformat plain-text

But I ran into the following error:

Exception in thread "main" java.io.IOError: java.nio.file.AccessDeniedException: /home/blo/wikiforia-master/wiki-extract/
at se.lth.cs.nlp.io.PlainTextWikipediaPageWriter.<init>(PlainTextWikipediaPageWriter.java:46)
at se.lth.cs.nlp.wikiforia.App.getSink(App.java:252)
at se.lth.cs.nlp.wikiforia.App.convert(App.java:305)
at se.lth.cs.nlp.wikiforia.App.main(App.java:488)
at kyo.MyExtractWikiforia.parseWikiDump(MyExtractWikiforia.java:46)
at kyo.MyExtractWikiforia.main(MyExtractWikiforia.java:26)
Caused by: java.nio.file.AccessDeniedException: C:\Users\kyo\git\wikiforia\wikiforia_extracted
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
at sun.nio.fs.WindowsFileSystemProvider.newFileChannel(WindowsFileSystemProvider.java:115)
at java.nio.channels.FileChannel.open(FileChannel.java:287)
at java.nio.channels.FileChannel.open(FileChannel.java:335)
at se.lth.cs.nlp.io.PlainTextWikipediaPageWriter.<init>(PlainTextWikipediaPageWriter.java:44)
... 5 more

Missing apostrophes in French Wikipedia conversion to text

Several cases are mishandled in frwiki, though fortunately not all. It only happens when apostrophes are mixed with markup symbols.
1.

L{{'}}'''Andalousie'''

becomes

LAndalousie

LAndalousie (Andalucía en espagnol, du bas latin ”Vandalucia”

in frwiki.

It should be

L'Andalousie

2.

L''''épistémologie''' (du [[grec ancien]] {{grec ancien|ἐπιστήμη}}

becomes

Lépistémologie

It should be

L'épistémologie
