mcs_kfold

mcs_kfold stands for "Monte Carlo stratified k-fold". The library tries to distribute discrete/categorical variables equally across all folds. Internally, it repeats stratified k-fold trials with different seeds and keeps the seed with the least entropy in the distribution of the specified variables. Its greatest advantage is that it can be applied to multi-dimensional targets.
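The seed search described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's internals: the helper names (`stratified_folds`, `imbalance`, `best_seed`) are hypothetical, stratification is done on the first column only, and the balance score is a sum of KL divergences between each fold's category distribution and the overall distribution, which stands in for whatever measure the library actually uses. The data is synthetic.

```python
import math
import random
from collections import Counter


def stratified_folds(labels, n_splits, seed):
    """Assign each row to a fold, stratifying on one label column (assumption)."""
    rng = random.Random(seed)
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)
    fold_of = [0] * len(labels)
    offset = 0
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)  # the seed only changes the order within a label
        for pos, idx in enumerate(members):
            fold_of[idx] = (offset + pos) % n_splits  # deal rows round-robin
        offset += len(members)
    return fold_of


def imbalance(rows, fold_of, n_splits):
    """Hypothetical score: sum of KL(fold distribution || overall distribution)."""
    total = 0.0
    for col in range(len(rows[0])):
        overall = Counter(row[col] for row in rows)
        for fold in range(n_splits):
            in_fold = Counter(
                row[col] for row, f in zip(rows, fold_of) if f == fold
            )
            size = sum(in_fold.values())
            for value, count in in_fold.items():
                p = count / size
                q = overall[value] / len(rows)
                total += p * math.log(p / q)
    return total


def best_seed(rows, n_splits, max_iter):
    """Try seeds 0..max_iter-1 and keep the split with the lowest score."""
    labels = [row[0] for row in rows]
    best = None
    for seed in range(max_iter):
        fold_of = stratified_folds(labels, n_splits, seed)
        score = imbalance(rows, fold_of, n_splits)
        if best is None or score < best[0]:
            best = (score, seed, fold_of)
    return best


# Synthetic rows with three categorical columns (made-up data).
data_rng = random.Random(0)
rows = [
    (data_rng.randint(0, 1), data_rng.choice([1, 2, 3]), data_rng.choice(["male", "female"]))
    for _ in range(200)
]
score, seed, fold_of = best_seed(rows, n_splits=5, max_iter=50)
print(f"best seed: {seed}, imbalance score: {score:.4f}")
```

A single stratified split balances only the column it stratifies on. Scoring every candidate split on all columns and keeping the best one is what lets the search handle several target variables at once.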

Usage

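from mcs_kfold import MCSKFold

num_cv = 5  # number of folds
# df is the DataFrame that holds the columns listed in target_cols
mcskf = MCSKFold(n_splits=num_cv, shuffle_mc=True, max_iter=100)

for fold, (train_idx, valid_idx) in enumerate(
    mcskf.split(df=df, target_cols=["Survived", "Pclass", "Sex"])
):
    ...  # train and validate on this fold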

See also the example for further information.

The histograms shown below were generated with this library from the Kaggle "Titanic: Machine Learning from Disaster" data. They show that the three target variables are distributed equally across the five folds.

[Figures: histograms of the three target variables, one image each for fold 0, fold 1, fold 2, fold 3, and fold 4]

Install

pip

pip install mcs_kfold

Install newest version

git clone https://github.com/MasashiSode/mcs_kfold
cd mcs_kfold
pip install .

Develop

poetry install

Test

pytest

Contributors

masashisode, wakame1367, yujiariyasu
