
dpgbdt's Introduction

Differentially Private Gradient Boosted Decision Trees

An implementation of https://arxiv.org/abs/1911.04209.

See example.py for usage instructions.

Dependencies

Make sure the following packages are available to this project:

  • numpy in version 1.20.x
  • pandas in version 1.2.x
  • scikit-learn in version 0.24.x
  • scipy in version 1.6.x

dpgbdt's People

Contributors

ypotdevin, giovannt0, kirschte

dpgbdt's Issues

Post process negative loss

In dp_rmse.py, the lines

noise = rng.standard_cauchy()
dp_rmse = cast(float, rmse + 2 * (gamma + 1) * sens * noise / epsilon)

may yield (large) negative values, which are unreasonable for an rMSE loss. The standard Cauchy distribution is heavy-tailed, so extreme noise draws are not rare.
Some possible countermeasures:

  1. Simply skip new trees whose associated loss is negative.
  2. Keep track of recent losses to detect outlier losses (not necessarily just negative ones) and skip those.
  3. After encountering a negative loss, clip it to 0 and raise the threshold used in the comparison prev_loss < current_loss step by step, according to a (yet to be determined) schedule, so that new trees get a chance to join the ensemble again (which would otherwise be highly unlikely).
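
Countermeasures 1 and 2 could be sketched roughly as follows. This is only an illustration, not code from the repository: the class name LossFilter, the window size and the outlier factor are all hypothetical, and the median-based rule is just one possible way to detect outliers. The filter looks only at the already-noised loss values, never at the raw rMSE.

```python
from collections import deque
from statistics import median


class LossFilter:
    """Hypothetical helper deciding whether a noisy rMSE value is plausible."""

    def __init__(self, window=5, factor=3.0):
        # Most recently accepted noisy losses (assumed window size).
        self.recent = deque(maxlen=window)
        # Assumed tolerance: losses above factor * median count as outliers.
        self.factor = factor

    def is_plausible(self, loss):
        # Countermeasure 1: a negative rMSE is never plausible.
        if loss < 0:
            return False
        # Countermeasure 2: reject losses far above the recent median.
        if self.recent and loss > self.factor * median(self.recent):
            return False
        # Only accepted losses enter the history.
        self.recent.append(loss)
        return True
```

A training loop would pass each dp_rmse value to is_plausible and skip the new tree whenever it returns False. Note that a recent median of exactly 0 would make this rule reject every positive loss, so a real implementation would need a floor or a different statistic.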

Bug: If geometric leaf clipping is deactivated, the value for delta_v is wrong

Relevant section of the C++ code:

// sensitivity for leaves
if (params->gradient_filtering && !params->leaf_clipping) {
    // you can only "turn off" leaf clipping if GDF is enabled!
    tree_params.delta_v = params->l2_threshold / (1 + params->l2_lambda);
} else {
    tree_params.delta_v = std::min((double) (params->l2_threshold / (1 + params->l2_lambda)),
            2 * params->l2_threshold * pow(1-params->learning_rate, tree_index));
}
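
To see which configurations reach which branch, the quoted logic can be mirrored in Python. This is a plain transliteration for experimentation, not a fix, and it makes no claim about what the correct value should be; the function name and the numbers in the usage note are hypothetical.

```python
def delta_v(l2_threshold, l2_lambda, learning_rate, tree_index,
            gradient_filtering, leaf_clipping):
    """Transliteration of the quoted C++ branch (behavior unchanged)."""
    # Bound that is used alone when GDF is on and leaf clipping is off.
    gdf_bound = l2_threshold / (1 + l2_lambda)
    if gradient_filtering and not leaf_clipping:
        return gdf_bound
    # Every other configuration takes the minimum with the geometric
    # bound, including the case where leaf clipping is off and GDF is off.
    geometric_bound = 2 * l2_threshold * (1 - learning_rate) ** tree_index
    return min(gdf_bound, geometric_bound)
```

For example, with l2_threshold=1.0, l2_lambda=0.1, learning_rate=0.1 and tree_index=50, switching gradient_filtering from True to False while leaf_clipping stays False moves the result from about 0.909 to about 0.0103, because the else branch still applies the geometrically decaying bound.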
