
question-tagging

Tagging Stack Overflow questions with deep neural networks

Project idea from Awesome Deep Learning Project Ideas

Data

The training data set was originally downloaded from "StackLite: Stack Overflow questions and tags" and is licensed by Stack Exchange, Inc. under CC BY-SA 3.0. For each question it contains the score and the answer count as well as the anonymized ID of its owner. The neural net tries to map this feature vector to one of the 50 most frequently used question tags, such as java, c++ or html.
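As a rough illustration of the mapping described above, the sketch below turns one question into an input vector and its tag into a one-hot target. The function names, the example values and the three-entry tag list are hypothetical; the project itself uses 50 tags, and the exact preprocessing in src/model.py may differ.

```python
# Hypothetical tag vocabulary for illustration only;
# the project uses the 50 most frequently used tags.
TAGS = ["java", "c++", "html"]


def encode_question(score, answer_count, owner_id):
    """Build the input vector from the three fields named in the data set."""
    return [float(score), float(answer_count), float(owner_id)]


def one_hot(tag, tags=TAGS):
    """Return a target vector with 1.0 at the index of the given tag."""
    vec = [0.0] * len(tags)
    vec[tags.index(tag)] = 1.0
    return vec


# Made-up example question: score 5, two answers, owner 1234, tagged c++.
features = encode_question(score=5, answer_count=2, owner_id=1234)
target = one_hot("c++")
print(features, target)
```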

Neural network

The neural net was implemented as a computational graph with the popular machine learning library TensorFlow. You can find my model in the following Python module: src/model.py. The picture below shows the network architecture. It consists of four hidden layers with 10, 12, 24 and 48 neurons, where each neuron has a ReLU activation. The output layer holds one neuron for each question tag and applies the softmax function to their activations for classification.

[Figure: network architecture]
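To make the layer structure concrete, here is a dependency-free sketch of a forward pass through a network of this shape. It is not the TensorFlow graph from src/model.py: the three-element input, the random weight initialization and all function names are assumptions made for illustration.

```python
import math
import random

# 3 inputs (assumed: score, answer count, owner ID), four hidden layers
# with 10, 12, 24 and 48 neurons, and one output neuron per tag (50).
LAYER_SIZES = [3, 10, 12, 24, 48, 50]


def init_params(sizes, seed=0):
    """Create small random weights and zero biases for every layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def dense(x, weights, biases):
    """Affine transform: one weighted sum plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    """Numerically stable softmax (subtract the maximum before exp)."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, params):
    """ReLU on every hidden layer, softmax on the output layer."""
    for weights, biases in params[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = params[-1]
    return softmax(dense(x, weights, biases))


params = init_params(LAYER_SIZES)
probs = forward([5.0, 2.0, 1234.0], params)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(len(probs), predicted)
```

The index of the largest probability is taken as the predicted tag. With untrained random weights the prediction is of course meaningless; the sketch only shows how data flows through the layers.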

Train model

To train the model, run the following command in the root folder of this project. Python 3 is recommended, and Google's TensorFlow and matplotlib are required.

$ python src/model.py

Results

This model reaches an accuracy of over 85% on both the training and the test data set after 20000 training iterations. The picture below shows the model's loss in relation to its training epochs. Under data/trained_models you can find this neural net as a pre-trained model with its adjusted weights and biases. Use TensorFlow's tf.train.Saver to load this model and make your own predictions for tagged questions.

[Figure: model loss over training epochs]
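Restoring the pre-trained model could look roughly like the sketch below. It assumes that data/trained_models contains a checkpoint index plus a matching .meta file, and that the installed TensorFlow exposes the tf.compat.v1 namespace (in older 1.x releases the same calls live directly under tf.train). The function name restore_model is hypothetical, and the names of the input and output tensors are not documented here, so they have to be looked up in the restored graph.

```python
def restore_model(model_dir="data/trained_models"):
    """Restore the latest checkpoint in model_dir; return (session, graph)."""
    import tensorflow as tf  # imported lazily so the sketch loads without TF

    tf1 = tf.compat.v1
    checkpoint = tf.train.latest_checkpoint(model_dir)
    if checkpoint is None:
        raise FileNotFoundError("no checkpoint found in " + model_dir)

    graph = tf.Graph()
    with graph.as_default():
        # Rebuild the graph from the .meta file (assumed to sit next to
        # the checkpoint); import_meta_graph returns a tf.train.Saver.
        saver = tf1.train.import_meta_graph(checkpoint + ".meta")
        session = tf1.Session(graph=graph)
        # Load the trained weights and biases into the rebuilt graph.
        saver.restore(session, checkpoint)
    return session, graph
```

After calling restore_model(), graph.get_operations() lists the operations in the restored graph, which helps to find the placeholder to feed and the softmax output to evaluate with session.run.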

Contributors

erohkohl
