
gcrf_gui_tool's Introduction

GCRFs tool

Structured regression models are designed to use relationships between objects for predicting output variables. In other words, structured regression models consider the attributes of objects and dependencies between the objects to make predictions.

The Gaussian Conditional Random Fields (GCRF) model is a type of structured regression model that incorporates the outputs of unstructured predictors (based on the given attribute values) and the correlation between output variables in order to achieve higher prediction accuracy.
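
As a rough illustration of how a GCRF combines unstructured predictions with correlation between outputs, the sketch below computes the predictive mean in the simplest setting: a single unstructured predictor with outputs R, a single symmetric similarity matrix S, and parameters alpha > 0 and beta >= 0. Under those assumptions the mean mu solves (alpha*I + beta*L) mu = alpha*R, where L = D - S is the graph Laplacian. The names GcrfSketch and predictMean are hypothetical and are not taken from the tool's code, and the scaling of the pairwise terms is an assumed convention. The sketch uses plain Gaussian elimination only to stay self-contained.

```java
// Minimal sketch of GCRF mean inference (assumed setting: one predictor, one symmetric S).
final class GcrfSketch {

    /** Solves (alpha*I + beta*L) mu = alpha*R with L = D - S and returns mu. */
    static double[] predictMean(double[][] s, double[] r, double alpha, double beta) {
        int n = r.length;
        if (s.length != n) throw new IllegalArgumentException("S must be n x n");
        double[][] a = new double[n][n + 1]; // augmented matrix [A | b]
        for (int i = 0; i < n; i++) {
            if (s[i].length != n) throw new IllegalArgumentException("S must be n x n");
            double degree = 0.0;
            for (int j = 0; j < n; j++) {
                if (i != j) {
                    degree += s[i][j];
                    a[i][j] = -beta * s[i][j];   // off-diagonal entries of beta*L
                }
            }
            a[i][i] = alpha + beta * degree;     // diagonal of alpha*I + beta*L
            a[i][n] = alpha * r[i];              // right-hand side alpha*R
        }
        return solve(a);
    }

    /** Gaussian elimination with partial pivoting on an augmented n x (n+1) matrix. */
    static double[] solve(double[][] a) {
        int n = a.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int row = col + 1; row < n; row++) {
                if (Math.abs(a[row][col]) > Math.abs(a[pivot][col])) pivot = row;
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < 1e-12) throw new ArithmeticException("singular system");
            for (int row = col + 1; row < n; row++) {
                double factor = a[row][col] / a[col][col];
                for (int k = col; k <= n; k++) a[row][k] -= factor * a[col][k];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {       // back substitution
            double sum = a[i][n];
            for (int j = i + 1; j < n; j++) sum -= a[i][j] * x[j];
            x[i] = sum / a[i][i];
        }
        return x;
    }

    public static void main(String[] args) {
        double[][] s = {{0, 1}, {1, 0}};         // two connected nodes
        double[] r = {0.0, 2.0};                 // unstructured predictions
        double[] mu = predictMean(s, r, 1.0, 1.0);
        System.out.println(mu[0] + " " + mu[1]);
    }
}
```

With beta = 0 the structured prediction reduces to R; a larger beta pulls the predictions of similar nodes closer together.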

The GCRFs tool is open-source software that integrates various GCRF methods and supports training and testing of those methods on synthetic and real-world datasets.

Installation and User manual

Download the zip file from http://gcrfs-tool.com/.
Extract the zip file to the desired location. The extracted folder GCRFsTOOL contains the executable file gui.jar.

User manual is available here.

Technical requirements

  • Java 8
  • Matlab (optional; needed only if you want to test the following methods: UmGCRF, m-GCRF, up-GCRF, and RLSR)

Methods

Structured regression methods:

  • Gaussian Conditional Random Fields (GCRF) is a structured regression model that incorporates the outputs of unstructured predictors (based on the given attribute values) and the correlation between output variables in order to achieve higher prediction accuracy.

  • Directed GCRF (DirGCRF) extends the GCRF method by considering asymmetric similarity (directed graphs).

  • Unimodal GCRF (UmGCRF) extends the GCRF parameter space to allow negative influences and improves computational efficiency.

  • Marginalized GCRF (m-GCRF) extends GCRF to naturally handle missing labels, rather than expecting the missing data to be treated in a preprocessing stage.

  • Uncertainty Propagation GCRF (up-GCRF) takes into account uncertainty that comes from the data when estimating uncertainty of the predictions.

  • Representation Learning based Structured Regression (RLSR) simultaneously learns hidden representation of objects and relationships among outputs.

Unstructured predictors:

  • Neural networks
  • Linear Regression
  • Multivariate Linear Regression

Datasets

The tool provides 7 sample datasets.

  • Geostep Asymmetric:
    • Nodes: treasure hunt games - 25 for train, 25 for test
    • Network: similarity between games, based on common number of clues
    • Attributes: 6 - the number of clues in each category (business, social, travel, and irrelevant), game privacy scope, and game duration
    • Goal: predict probability that the game can be used for touristic purposes
    • Note: Linear regression cannot be applied to this data.
  • Geostep Symmetric:
    • Similarity is converted from asymmetric to symmetric
    • Note: Linear regression cannot be applied to this data.
  • Teen Asymmetric 1 x:
    • Nodes: 50 teenagers
    • Network: friendship network - teenagers were asked to identify up to 12 best friends
    • Attribute: teenager's alcohol consumption (ranging from 1 to 5) in previous time point
    • Goal: predict alcohol consumption at the observation time point
  • Teen Asymmetric 3 x:
    • Nodes: 50 teenagers
    • Network: friendship network - teenagers were asked to identify up to 12 best friends
    • Attributes: teenager's alcohol consumption (ranging from 1 to 5) in three previous time points
    • Goal: predict alcohol consumption at the observation time point
  • Energy RLSR:
    • Nodes: 10
    • Network: no network (method should learn similarity)
    • Attributes: 1
    • Goal: predict daily solar energy income
    • Time points: 1600
  • Rain up-GCRF:
    • Nodes: 100
    • Network: no network (method should learn similarity)
    • Attributes: 2
    • Goal: predict rainfall
    • Time points: 708
  • Random m-GCRF:
    • Randomly generated data
    • Nodes: 20
    • Attributes: 3
    • Time points: 3
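
The Geostep Symmetric dataset is described above only as having its similarity converted from asymmetric to symmetric; how the conversion is done is not stated. One common approach, shown here purely as an assumption and not as the tool's actual procedure, is to average each pair of directed similarities, i.e. to replace S with (S + S^T) / 2. The class and method names are hypothetical.

```java
// Minimal sketch: symmetrizing a directed similarity matrix by averaging.
// Assumption: (S + S^T) / 2 is used; the tool may apply a different rule.
final class SimilarityUtil {

    /** Returns a new matrix whose (i, j) entry is the mean of s[i][j] and s[j][i]. */
    static double[][] symmetrize(double[][] s) {
        int n = s.length;
        for (double[] row : s) {
            if (row.length != n) throw new IllegalArgumentException("matrix must be square");
        }
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                out[i][j] = 0.5 * (s[i][j] + s[j][i]);
            }
        }
        return out;
    }
}
```

Other choices, such as taking the maximum or minimum of the two directions, would also yield a symmetric matrix and may suit some data better.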

Users can add their own dataset using the Add dataset option in the Datasets menu.

Accuracy measure

The coefficient of determination (R²) is used to measure the regression accuracy of all methods. R² measures how closely the output of the model matches the actual values in the data. A score of 0 indicates a very poor match, while a score of 1 indicates a perfect match.
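
For reference, the coefficient of determination can be computed as sketched below, assuming the standard definition 1 - SS_res / SS_tot (the exact formula used inside the tool is not stated here). The class and method names are hypothetical.

```java
// Minimal sketch: coefficient of determination, assuming the standard definition.
final class RSquared {

    /** Returns 1 - SS_res / SS_tot for the given actual and predicted values. */
    static double score(double[] actual, double[] predicted) {
        if (actual.length == 0 || actual.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double mean = 0.0;
        for (double y : actual) mean += y;
        mean /= actual.length;

        double ssRes = 0.0; // sum of squared residuals
        double ssTot = 0.0; // total sum of squares around the mean
        for (int i = 0; i < actual.length; i++) {
            ssRes += (actual[i] - predicted[i]) * (actual[i] - predicted[i]);
            ssTot += (actual[i] - mean) * (actual[i] - mean);
        }
        if (ssTot == 0.0) {
            throw new IllegalArgumentException("actual values have zero variance");
        }
        return 1.0 - ssRes / ssTot;
    }

    public static void main(String[] args) {
        double[] actual = {1.0, 2.0, 3.0, 4.0};
        double[] predicted = {1.1, 1.9, 3.2, 3.8};
        System.out.println(score(actual, predicted)); // approximately 0.98
    }
}
```

Under this definition, a model that always predicts the mean of the actual values scores 0, and a model that does worse than that scores below 0.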

References

  • GCRF: Vladan Radosavljevic, Slobodan Vucetic, and Zoran Obradovic. Continuous conditional random fields for regression in remote sensing. In Proceedings of European Conference on Artificial Intelligence (ECAI), pages 809–814, 2010 (PDF)
  • DirGCRF: Tijana Vujicic, Jesse Glass, Fang Zhou, and Zoran Obradovic. Gaussian conditional random fields extended for directed graphs. Machine Learning, 106(9-10):1271–1288, 2017 (PDF)
  • UmGCRF: Jesse Glass, Mohamed F Ghalwash, Milan Vukicevic, and Zoran Obradovic. Extending the modelling capacity of Gaussian conditional random fields while learning faster. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), pages 1596–1602, 2016 (PDF)
  • RLSR: Chao Han, Shanshan Zhang, Mohamed Ghalwash, Slobodan Vucetic, and Zoran Obradovic. Joint learning of representation and structure for sparse regression on graphs. In Proceedings of the SIAM International Conference on Data Mining, pages 846–854, 2016 (PDF)
  • up-GCRF: Djordje Gligorijevic, Jelena Stojanovic, and Zoran Obradovic. Uncertainty propagation in long-term structured regression on evolving networks. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), pages 1603–1609, 2016 (PDF)
  • m-GCRF: Jelena Stojanovic, Milos Jovanovic, Djordje Gligorijevic, and Zoran Obradovic. Semi-supervised learning for structured regression on partially observed attributed graphs. In Proceedings of the SIAM International Conference on Data Mining, pages 217–225, 2015 (PDF)

Efficiency

The scalability of the tool and the running times of the different methods were assessed on datasets with varying numbers of nodes: 100, 500, 1000, and 5000. All experiments were run on Windows with 16 GB of RAM and a 3.4 GHz CPU. Running times after 50 iterations are shown in the table below.

| No. of nodes | No. of edges | GCRF | DirGCRF | UmGCRF | m-GCRF | up-GCRF | RLSR |
|---|---|---|---|---|---|---|---|
| 100 | 5,094 | 0.27 s | 0.17 s | 6.62 s | 8.49 s | 26.84 s | 69.25 s |
| 500 | 127,540 | 16.98 s | 9.49 s | 7.45 s | 17 s | 6.58 min | 8.25 min |
| 1000 | 509,376 | 129.4 s | 69.57 s | 8 s | 53.15 s | 27.6 min | 1 h 18 min |
| 5000 | 12,749,518 | 4 h 45 min | 2 h 12 min | 34.48 s | 65 min | N/A | N/A |

External libraries

  • OjAlgo (oj! Algorithms) for matrix calculations
  • Neuroph for neural networks implementation
  • Matlabcontrol for calling Matlab from Java

Contributors

  • Tijana (Vujicic) Markovic
  • Vladan Devedzic
  • Fang Zhou
  • Zoran Obradovic
  • Jesse Glass
  • Jelena Stojanovic
  • Djordje Gligorijevic
  • Chao Han
  • Ivan Knezevic
  • Petar Radunovic

Questionnaire for tool evaluation

The main goal of the GCRFs tool is to provide a straightforward and user-friendly graphical user interface that simplifies the use of GCRF methods for expert and non-expert users. To gain detailed insight into users' experiences and opinions, we have created a questionnaire for tool evaluation.

Please fill out the questionnaire; we are open to your opinions and suggestions.
