gbdt_var

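GBDT-derived variables and their applications

Deriving GBDT variables

  • get_gbdt_path_var: derives one variable from each root-to-leaf path of every GBDT subtree. Each variable name encodes the node information along its path, which makes the variable easy to trace back.
  • get_data_gbdt: replays the GBDT-derived variables on other datasets. Each variable is evaluated directly from the feature values, which generalizes more easily than sklearn's apply and transform.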
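The idea behind path-derived variables can be sketched with scikit-learn alone. This is a minimal illustration, not the repository's implementation: the helper names leaf_paths and path_variables, the column-naming scheme, and the synthetic data are all assumptions made for the example. Each leaf becomes a 0/1 column whose name spells out the conditions along its path, so the same column can be rebuilt on any dataset from the feature values alone.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier


def leaf_paths(tree, feature_names):
    """Return {leaf_id: [(feature, op, threshold), ...]} for one fitted tree."""
    t = tree.tree_
    paths = {}

    def walk(node, conds):
        if t.children_left[node] == -1:  # -1 marks a leaf in sklearn trees
            paths[int(node)] = conds
            return
        name = feature_names[t.feature[node]]
        thr = float(t.threshold[node])
        walk(t.children_left[node], conds + [(name, "<=", thr)])
        walk(t.children_right[node], conds + [(name, ">", thr)])

    walk(0, [])
    return paths


def path_variables(gbdt, X):
    """Hypothetical helper: one 0/1 column per leaf path, named after its conditions."""
    # sklearn trees compare float32 inputs against thresholds, so mirror that cast.
    Xf = X.astype(np.float32).astype(np.float64)
    cols = {}
    for i, est in enumerate(gbdt.estimators_[:, 0]):
        for leaf, conds in leaf_paths(est, list(X.columns)).items():
            mask = np.ones(len(Xf), dtype=bool)
            for name, op, thr in conds:
                vals = Xf[name].to_numpy()
                mask &= (vals <= thr) if op == "<=" else (vals > thr)
            label = " & ".join(f"{n}{op}{thr:.4f}" for n, op, thr in conds)
            cols[f"tree{i}_leaf{leaf}: {label}"] = mask.astype(int)
    return pd.DataFrame(cols, index=X.index)


# Synthetic data, used only to exercise the sketch.
X_arr, y = make_classification(n_samples=500, n_features=5, random_state=0)
X = pd.DataFrame(X_arr, columns=[f"x{j}" for j in range(5)])
gbdt = GradientBoostingClassifier(n_estimators=3, max_depth=2, random_state=0).fit(X, y)
V = path_variables(gbdt, X)
print(V.columns[0])
```

Because the columns are rebuilt from the stored conditions rather than from the fitted model object, calling path_variables on a new DataFrame with the same feature columns gives the same variables there.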

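Rule extraction (usable for risk-control strategies or anti-fraud rules)

  • get_head_rule: prints the top n rules with the highest target rate.
  • get_rule_df: computes the coverage and target rate of every rule and returns a dataset containing this information.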
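The two statistics reported per rule can be computed as in the sketch below. The function name rule_stats, the rule names, and the toy data are assumptions for illustration and are not taken from the repository. Coverage is the share of samples a rule hits, and target rate is the share of positives among the samples it hits.

```python
import pandas as pd


def rule_stats(rule_flags, y):
    """Hypothetical helper: coverage and target rate for each 0/1 rule column."""
    rows = []
    for name in rule_flags.columns:
        hit = rule_flags[name] == 1
        n_hit = int(hit.sum())
        rows.append({
            "rule": name,
            "coverage": n_hit / len(rule_flags),
            # A rule that hits nothing has no defined target rate.
            "target_rate": float(y[hit].mean()) if n_hit else float("nan"),
        })
    out = pd.DataFrame(rows)
    return out.sort_values("target_rate", ascending=False, ignore_index=True)


# Toy rule flags and labels, made up for the example.
flags = pd.DataFrame({
    "income>5000":         [0, 0, 1, 1, 1, 1, 0, 0],
    "age<=25 & n_loans>3": [1, 1, 0, 0, 0, 0, 0, 0],
})
y = pd.Series([1, 1, 0, 1, 0, 0, 0, 1])

stats = rule_stats(flags, y)
print(stats.head(1))  # the rule with the highest target rate
```

Sorting by target rate and taking the head of the table corresponds to listing the top n rules.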

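When extracting rules with GBDT, pay attention to the following parameters, because they affect the correlation of the extracted variables (for example, min_samples_leaf tends to discard rules that are strongly correlated but have low coverage).

  1. max_depth controls the maximum number of variables used per rule: a rule contains at most max_depth conditions.
  2. min_samples_leaf controls the minimum sample coverage per rule: the coverage of a rule is never lower than min_samples_leaf (given as a float).
  3. n_estimators, together with max_depth, controls the number of rules: at most n_estimators*2^(max_depth) rules are extracted.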
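The three limits above can be checked on a fitted model. The sketch below uses scikit-learn's GradientBoostingClassifier on synthetic data; the parameter values are arbitrary choices for the example, and each leaf is counted as one rule.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data and arbitrary parameter values, for illustration only.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
params = dict(n_estimators=10, max_depth=3, min_samples_leaf=0.05)
gbdt = GradientBoostingClassifier(random_state=0, **params).fit(X, y)

n_rules = 0      # total number of leaves, one rule per leaf
min_cov = 1.0    # smallest share of training samples falling in any leaf
max_conds = 0    # longest path, i.e. most conditions in any rule
for est in gbdt.estimators_[:, 0]:
    t = est.tree_
    is_leaf = t.children_left == -1
    n_rules += int(is_leaf.sum())
    min_cov = min(min_cov, t.n_node_samples[is_leaf].min() / len(X))
    max_conds = max(max_conds, t.max_depth)

bound = params["n_estimators"] * 2 ** params["max_depth"]
print(f"rules: {n_rules} (bound {bound})")
print(f"lowest coverage: {min_cov:.3f}, most conditions: {max_conds}")
```

Raising min_samples_leaf lowers the rule count well below the bound, since every leaf must hold at least that share of the samples.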

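Logistic regression (the traditional GBDT+LR implementation)

  • get_lr_model: trains a logistic regression model, then prints and returns the model's intercept, coefficients, and selected variables.
  • get_lr_proba: computes the predicted probability of a logistic regression model for a given intercept, set of coefficients, and selected variables (the result matches lr.predict_proba).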
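A minimal GBDT+LR sketch follows, showing why a probability computed from the intercept and coefficients agrees with predict_proba. The function lr_proba, the arbitrary column selection, and the synthetic data are assumptions for the example, not the repository's code. Leaf indices from the GBDT are one-hot encoded and fed to a logistic regression.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder


def lr_proba(Z, intercept, coefs, selected):
    """Hypothetical helper: sigmoid(intercept + Z[:, selected] @ coefs)."""
    z = intercept + Z[:, selected] @ np.asarray(coefs)
    return 1.0 / (1.0 + np.exp(-z))


# Synthetic data, used only to exercise the sketch.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
gbdt = GradientBoostingClassifier(n_estimators=5, max_depth=2, random_state=0).fit(X, y)

# One-hot encode the leaf each sample lands in, one block of columns per tree.
leaves = gbdt.apply(X)[:, :, 0]
Z = OneHotEncoder().fit_transform(leaves).toarray()

# Arbitrary selection (every other column) standing in for a variable-selection step.
selected = np.arange(Z.shape[1])[::2]
lr = LogisticRegression(max_iter=1000).fit(Z[:, selected], y)

p_manual = lr_proba(Z, lr.intercept_[0], lr.coef_[0], selected)
p_sklearn = lr.predict_proba(Z[:, selected])[:, 1]
print(np.allclose(p_manual, p_sklearn))
```

Storing only the intercept, coefficients, and selected column indices is enough to score new data without keeping the fitted LogisticRegression object.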

Contributors

  • zglee96
