MixMatch

This repository walks through the steps of MixMatch, a semi-supervised learning algorithm.

MixMatch was originally designed for image classification and is usually run with CUDA support. Together with Yang Wan and Minchenxi Zhou, I adapted the approach and its code to tabular data and a CPU-only environment during my internship. I cannot upload the code here because of confidentiality requirements, but I can share my understanding of the algorithm, since it is public :)

Data Preparation

  • Split the data into labeled and unlabeled sets
  • Data preprocessing
    • Drop features where over 95% of the values are missing
    • Impute missing values: -1 for int (discrete values), mean for float (continuous values)
    • Construct matrices after dimension separation
    • The final data should be a 4-dimensional array:
      • sample size
      • number of channels (3 for RGB images, 1 for grayscale or non-image data)
      • matrix height
      • matrix width
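
To make the preprocessing steps concrete, here is a minimal sketch in plain Python. It is not the confidential internship code. The function names `preprocess` and `to_4d`, the use of `None` for missing values, and the choice to zero-pad each feature vector into a square single-channel matrix are all assumptions made for illustration.

```python
import math

def preprocess(rows, max_missing=0.95):
    """Drop sparse features and impute the rest.

    rows: list of equal-length lists; None marks a missing value (assumption).
    """
    n_rows = len(rows)
    kept = []
    for col in zip(*rows):
        present = [v for v in col if v is not None]
        # Drop features where over 95% of the values are missing.
        if (n_rows - len(present)) / n_rows > max_missing:
            continue
        if all(isinstance(v, int) for v in present):
            fill = -1  # discrete feature: impute -1
        else:
            fill = sum(present) / len(present)  # continuous feature: impute the mean
        kept.append([fill if v is None else v for v in col])
    return [list(r) for r in zip(*kept)]

def to_4d(rows):
    """Turn each feature vector into a 1 x side x side matrix (zero-padded).

    The result is nested as (sample, channel, height, width).
    """
    n_features = len(rows[0])
    side = math.isqrt(n_features - 1) + 1  # smallest side with side * side >= n_features
    batch = []
    for row in rows:
        padded = list(row) + [0] * (side * side - n_features)
        matrix = [padded[i * side:(i + 1) * side] for i in range(side)]
        batch.append([matrix])  # one channel for non-image data
    return batch
```

Calling `to_4d(preprocess(rows))` on a table with three surviving features gives one 1 x 2 x 2 matrix per sample, with the last cell padded with 0.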

Data Augmentation

  • Number of augmentations: 1 per labeled sample; K (a hyperparameter) per unlabeled sample
  • For image data
    • Strong augmentation: sharpening, adjusting saturation, and adjusting color temperature
    • Weak augmentation: translation, rotation, and cropping
  • For tabular data
    • Random flipping and cropping of the matrices, filling the cropped margins with zeros
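
A rough sketch of the tabular augmentation, assuming each sample is a single 2-D matrix stored as a list of lists. The helper names, the horizontal-only flip, and the "shift, crop what falls outside, fill the vacated margin with 0" reading of cropping are my assumptions, not a fixed part of the algorithm.

```python
import random

def random_flip(matrix, rng):
    """Mirror the matrix left-to-right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in matrix]
    return [row[:] for row in matrix]

def random_shift_crop(matrix, max_shift, rng):
    """Shift by up to max_shift cells in each direction.

    Cells pushed outside the frame are cropped; vacated margins are filled with 0.
    """
    h, w = len(matrix), len(matrix[0])
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            si, sj = i - dy, j - dx  # source cell that lands on (i, j)
            if 0 <= si < h and 0 <= sj < w:
                out[i][j] = matrix[si][sj]
    return out

def augment(matrix, k, max_shift=1, seed=None):
    """Return k independently augmented copies of one matrix."""
    rng = random.Random(seed)
    return [random_shift_crop(random_flip(matrix, rng), max_shift, rng)
            for _ in range(k)]
```

With this sketch, a labeled sample would be passed through `augment(matrix, k=1)` and an unlabeled one through `augment(matrix, k=K)`.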

Label Guessing

  • Generate pseudo labels for unlabeled data: predict class probabilities for each of the K augmented copies of a sample with the current model, then take the mean as the final result
  • For image data, Wide-ResNet-28 (a wide residual network with 28 layers) is commonly used. However, technically other models that output class probabilities could work too, depending on the data format
    • Figure: Wide-ResNet structure
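
The averaging step can be sketched as follows. `predict` stands in for whatever model is used (Wide-ResNet or otherwise) and is a hypothetical callable that maps one augmented sample to a list of class probabilities.

```python
def guess_label(predict, views):
    """Average the predicted class probabilities over all augmented views.

    predict: hypothetical callable, sample -> list of class probabilities.
    views:   the K augmented copies of one unlabeled sample.
    """
    preds = [predict(v) for v in views]
    n_classes = len(preds[0])
    return [sum(p[c] for p in preds) / len(preds) for c in range(n_classes)]
```

Because every prediction sums to 1, the averaged pseudo label is again a valid probability distribution.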

Sharpening Pseudo Labels

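  • Lower the entropy of the averaged prediction by pushing it toward a one-hot distribution, so that the class with the largest probability stands out clearly. In the MixMatch paper this is done with a temperature T: each probability is raised to the power 1/T and the result is renormalized; as T approaches 0 the output approaches one-hot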
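
A minimal sketch of temperature sharpening as described in the MixMatch paper. The default `temperature=0.5` is the paper's suggested starting value, not something taken from my internship code.

```python
def sharpen(probs, temperature=0.5):
    """Raise each probability to 1/temperature and renormalize.

    Lower temperatures push the distribution closer to one-hot.
    """
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]
```

For example, `sharpen([0.6, 0.4])` moves the larger entry from 0.6 to roughly 0.69, and a very small temperature makes the output almost exactly one-hot.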

Shuffle

  • Put together X and U as W; then shuffle W
    • X: augmented labeled data and their labels
    • U: augmented unlabeled data and their pseudo labels

MixUp

  • Mix each element of X and U with an element of the shuffled W
    • hyper-parameter α = 0.75
      • λ ~ Beta(α, α)
      • λ' = max(λ, 1-λ)
      • x' = λ'x1 + (1-λ')x2
      • p' = λ'p1 + (1-λ')p2
  • Taking λ' = max(λ, 1-λ) ensures that (x1, p1) keeps the larger weight in the mixed result. In this way, the mixed X's still mainly represent labeled data while the mixed U's mainly represent unlabeled data
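
The shuffle and MixUp steps together can be sketched like this. I assume each sample is a flat list of features paired with a label distribution (a matrix would be flattened first, or mixed cell by cell in the same way). The function names are made up for this example.

```python
import random

def mixup_pair(x1, p1, x2, p2, alpha, rng):
    """Mix (x1, p1) with (x2, p2); the first pair always keeps the larger weight."""
    lam = rng.betavariate(alpha, alpha)  # lambda ~ Beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)            # lambda' = max(lambda, 1 - lambda)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    p = [lam * a + (1.0 - lam) * b for a, b in zip(p1, p2)]
    return x, p, lam

def mixmatch_mix(X, U, alpha=0.75, seed=None):
    """X, U: lists of (features, label_distribution) pairs.

    Returns the mixed labeled batch and the mixed unlabeled batch.
    """
    rng = random.Random(seed)
    combined = X + U
    W = combined[:]   # copy, so the original order is kept for pairing
    rng.shuffle(W)
    mixed = []
    for (x, p), (wx, wp) in zip(combined, W):
        mx, mp, _ = mixup_pair(x, p, wx, wp, alpha, rng)
        mixed.append((mx, mp))
    return mixed[:len(X)], mixed[len(X):]
```

The first `len(X)` mixed samples are anchored on labeled data and the rest on unlabeled data, which is why the two groups can still be given different losses afterwards.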

Mixed Loss Calculation

  • Calculate the cross entropy between predictions and labels for labeled data (Lx)
  • Calculate the squared L2 distance between predictions and pseudo labels for unlabeled data (Lu)
  • L = Lx + λuLu
    • hyper-parameter λu = 100
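
As a sketch, the combined loss could be computed as below. Inputs are assumed to be lists of (target distribution, model prediction) pairs. Averaging the squared L2 term over the number of classes follows the MixMatch paper, and the small `eps` inside the logarithm is my own addition for numerical safety.

```python
import math

def cross_entropy(target, pred, eps=1e-12):
    """Cross entropy between a target distribution and a prediction."""
    return -sum(t * math.log(q + eps) for t, q in zip(target, pred))

def squared_l2(target, pred):
    """Squared L2 distance, averaged over the number of classes."""
    return sum((t - q) ** 2 for t, q in zip(target, pred)) / len(target)

def mixmatch_loss(labeled, unlabeled, lambda_u=100.0):
    """Return (L, Lx, Lu) with L = Lx + lambda_u * Lu.

    labeled, unlabeled: lists of (target_distribution, model_prediction) pairs.
    """
    loss_x = sum(cross_entropy(t, q) for t, q in labeled) / len(labeled)
    loss_u = sum(squared_l2(t, q) for t, q in unlabeled) / len(unlabeled)
    return loss_x + lambda_u * loss_u, loss_x, loss_u
```

Returning the two parts alongside the total makes it easier to see whether the labeled or the unlabeled term dominates when tuning λu.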

Model Training & Parameter Tuning

  • Train the model (again a Wide-ResNet for image data) by minimizing the mixed loss
  • Tune hyperparameters such as K, α, and λu until the model performs well on held-out data
