Recognise the programming language of source code using a word2vec or word2doc neural network model.
- Clone some C++ and Python projects with git and put them in ./data/crawl_data/CPP and ./data/crawl_data/Python, or run spider.py to crawl code files from GitHub automatically.
- Extend the dict file in ./data/dict/dict: put the keywords of every programming language in the file dict.raw.
- Run the script run.sh; the result will be written to RESULT.
- Crawl more code data from GitHub, or modify spider.py to crawl data automatically.
- Collect keywords for every language and build the dict file.
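
As a starting point for collecting keywords, the sketch below gathers Python's reserved words from the standard library `keyword` module and writes them to a text file. The helper names `python_keywords` and `write_keywords`, the output file name `python_keywords.txt`, and the one-keyword-per-line layout are all assumptions made for illustration; the real format expected by dict.raw is not described here, so adapt the output before merging it into the dict file.

```python
import keyword
from pathlib import Path


def python_keywords():
    """Return Python's reserved words, sorted alphabetically."""
    return sorted(keyword.kwlist)


def write_keywords(words, path):
    """Write one keyword per line and return how many were written.

    The one-per-line layout is an assumption, not the documented
    format of dict.raw.
    """
    path = Path(path)
    # Create the parent directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(words) + "\n", encoding="utf-8")
    return len(words)


if __name__ == "__main__":
    # Hypothetical output file; it does not overwrite dict.raw.
    count = write_keywords(python_keywords(), "python_keywords.txt")
    print(f"wrote {count} keywords")
```

Keyword lists for other languages such as C++ have no equivalent in the Python standard library, so they would have to be collected by hand or from the language reference.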