binary_svm_mtl's Introduction

Binary_SVM_MTL

The code applies three linear SVM-based binary classification methods (global, local, and multi-task learning) to a data set that has been decomposed into multiple tasks.

  • The global method learns a single model for the whole data set.
  • The local method learns a model for each task separately.
  • The MTL method assumes a multi-task relationship between the models and learns a separate model for each task, with a regularizer that promotes the relationship among the tasks (see our paper for more details: https://arxiv.org/pdf/1705.10467.pdf).
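
As a rough sketch of how the three methods differ, the objectives can be written side by side. This assumes the standard hinge loss and an L2 penalty with weight λ for the global and local models; the code may use a different variant (for example a squared hinge loss or a bias term). Here m is the number of tasks, N_t the number of data points in task t, and (x_t^i, y_t^i) the i-th data point of task t. R is only a placeholder for the coupling regularizer, whose exact form is given in the paper.

```latex
\begin{align*}
\text{global:}\quad & \min_{w}\ \sum_{t=1}^{m}\sum_{i=1}^{N_t} \max\bigl(0,\ 1 - y_t^i\, w^\top x_t^i\bigr) + \lambda \lVert w \rVert_2^2 \\
\text{local:}\quad & \min_{w_t}\ \sum_{i=1}^{N_t} \max\bigl(0,\ 1 - y_t^i\, w_t^\top x_t^i\bigr) + \lambda \lVert w_t \rVert_2^2 \quad \text{for each } t = 1,\dots,m \\
\text{MTL:}\quad & \min_{w_1,\dots,w_m}\ \sum_{t=1}^{m}\sum_{i=1}^{N_t} \max\bigl(0,\ 1 - y_t^i\, w_t^\top x_t^i\bigr) + R(w_1,\dots,w_m)
\end{align*}
```

The global objective shares one weight vector across all tasks, the local objectives decouple completely, and the MTL objective keeps one weight vector per task but ties them together through R.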

The general idea of multi-task learning is to utilize the relationship between the tasks in order to boost the effective sample size. Therefore, MTL is usually expected to perform better than the global and local methods when:

  • The tasks are different enough that having separate models helps.
  • The tasks are related, so that MTL can exploit the relationship.
  • The number of data points for each task is not very large, and the locally learned model does not generalize well.

The input to the code should be a set of CSV files. Assuming there are m tasks, the number of CSV files is 2m: two for each task, named features_ and labels_ followed by the task number. The tasks should be numbered from 1 to m. For each task t, the features file contains an N_t x d matrix of features, where N_t is the number of data points in task t and d is the feature length. The labels file for task t contains an N_t x 1 vector of -1 and +1 values, which are the labels for the N_t data points in task t.
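
The input layout described above can be checked with a small loader before running the methods. This is a minimal sketch, not the repository's own loading code: the helper names load_matrix and load_tasks are hypothetical, and it assumes the files are called features_<t>.csv and labels_<t>.csv, are comma-separated, and have no header row.

```python
import csv
from pathlib import Path


def load_matrix(path):
    """Read a header-less, comma-separated CSV file into a list of float rows."""
    with open(path, newline="") as handle:
        return [[float(cell) for cell in row] for row in csv.reader(handle) if row]


def load_tasks(data_dir, m):
    """Load features and labels for tasks 1..m and check their shapes."""
    tasks = {}
    for t in range(1, m + 1):
        # Assumed file names: features_<t>.csv and labels_<t>.csv
        features = load_matrix(Path(data_dir) / f"features_{t}.csv")
        labels = [int(row[0]) for row in load_matrix(Path(data_dir) / f"labels_{t}.csv")]
        if len(features) != len(labels):
            raise ValueError(f"task {t}: {len(features)} feature rows, {len(labels)} labels")
        if any(y not in (-1, 1) for y in labels):
            raise ValueError(f"task {t}: labels must be -1 or +1")
        tasks[t] = (features, labels)
    # Every task must share the same feature length d.
    if len({len(row) for X, _ in tasks.values() for row in X}) > 1:
        raise ValueError("all tasks must have the same number of features")
    return tasks
```

Calling load_tasks("data", 3) would then return a dictionary mapping each task number to its (features, labels) pair, or raise an error that names the offending task.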

As noted above, the relative performance of the different methods depends substantially on the number of training points. You can control this number by changing the test_perc variable, which indicates the percentage of data points used for testing. By default, we use 5-fold cross validation (cv=5) with 10 trials to evaluate each method. We do a grid search over the hyper-parameters to find the best set of parameters.
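
To make the role of test_perc and the repeated trials concrete, here is a minimal sketch of how the train/test splits might be drawn. It is not the repository's code: split_task and make_trials are hypothetical helpers, test_perc is assumed to be a fraction between 0 and 1 (divide by 100 first if the code expects a value between 0 and 100), and each task is assumed to be split independently. The cross-validated grid search that would run on each training part is not shown.

```python
import random


def split_task(n_points, test_perc, rng):
    """Return (train_idx, test_idx) for one task with n_points data points."""
    indices = list(range(n_points))
    rng.shuffle(indices)
    # Number of held-out points; test_perc is assumed to be a fraction in [0, 1].
    n_test = int(round(test_perc * n_points))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def make_trials(task_sizes, test_perc, n_trials=10, seed=0):
    """Draw one independent split per task for each trial."""
    rng = random.Random(seed)
    return [
        {t: split_task(n, test_perc, rng) for t, n in task_sizes.items()}
        for _ in range(n_trials)
    ]
```

Raising test_perc shrinks every task's training set, which is the regime where MTL is expected to help most.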

Finally, the code shows the average score for each task, as well as the average and standard deviation of the overall score across all tasks, for all methods. The overall score of a method is defined simply as the average of its scores over the different tasks.
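
The reported numbers can be reproduced from raw per-trial scores along the following lines. This is a sketch under assumptions rather than the repository's code: summarize is a hypothetical helper, the scores of one method are assumed to be stored as one dictionary per trial mapping task number to score, and the standard deviation is assumed to be the population standard deviation taken over trials.

```python
from statistics import mean, pstdev


def summarize(scores):
    """scores[trial][task] -> (per-task means, overall mean, overall std)."""
    tasks = sorted(scores[0])
    # Average score of each task over all trials.
    per_task = {t: mean(trial[t] for trial in scores) for t in tasks}
    # Overall score of one trial: plain average over tasks.
    overall = [mean(trial[t] for t in tasks) for trial in scores]
    return per_task, mean(overall), pstdev(overall)
```

Running summarize once per method gives the per-task averages together with the mean and standard deviation of the overall score, so the three methods can be compared directly.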

binary_svm_mtl's People

Contributors

maziars
