klassets

The {klassets} package is a collection of functions to simulate data sets to:

  • Teach how some statistical models and machine learning algorithms work.
  • Illustrate particular phenomena such as heteroskedasticity or Simpson’s paradox.
  • Compare predictions across models, for example logistic regression vs decision tree vs k-Nearest Neighbours.
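
As a rough illustration of the second point, Simpson’s paradox can be reproduced in a few lines of base R. This is only a sketch with hypothetical group offsets and noise levels, and it does not use or reflect how {klassets} generates its data: within each group the slope is negative, while the pooled fit shows a positive slope.

```r
set.seed(123)

# Two hypothetical groups: within each group y decreases with x,
# but the group with larger x values also has a larger baseline.
n <- 200
group <- rep(c("a", "b"), each = n)
x <- c(rnorm(n, mean = 0), rnorm(n, mean = 4))
y <- ifelse(group == "a", 0, 8) - 1 * x + rnorm(2 * n, sd = 0.5)
sim <- data.frame(group = group, x = x, y = y)

# Slope of a single regression that ignores the grouping
pooled_slope <- coef(lm(y ~ x, data = sim))[["x"]]

# Slope of a separate regression inside each group
within_slopes <- sapply(
  split(sim, sim$group),
  function(d) coef(lm(y ~ x, data = d))[["x"]]
)

pooled_slope
within_slopes
```

Plotting `sim` coloured by `group` makes the reversal visible at a glance, which is the kind of picture the package is meant to help produce.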

Some examples

Don’t forget to visualize the data

library(klassets)

set.seed(123)

df <- sim_quasianscombe_set_1(beta0 = 3, beta1 = 0.5)

plot(df) +
  ggplot2::labs(subtitle = "Very similar to the given parameters (3 and 0.5)")

library(patchwork)

df2 <- sim_quasianscombe_set_2(df, fun = sin)
df6 <- sim_quasianscombe_set_6(df, groups = 2, b1_factor = -1)

plot(df2) + plot(df6)
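
The first example above uses `beta0 = 3` and `beta1 = 0.5`. A minimal base R sketch of the same idea, assuming a plain linear model with Gaussian noise (the sample size and noise level here are hypothetical and are not taken from the package), shows that an ordinary least squares fit lands close to those parameters:

```r
set.seed(123)

# Hypothetical stand-in for a simulated linear data set:
# intercept 3, slope 0.5, Gaussian noise with an assumed sd of 0.5
n <- 500
x <- rnorm(n)
y <- 3 + 0.5 * x + rnorm(n, sd = 0.5)

# Ordinary least squares fit; the estimates should be near 3 and 0.5
fit <- lm(y ~ x)
estimates <- coef(fit)
estimates
```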

Compare models in a classification task

df <- sim_response_xy(relationship = function(x, y) sin(x*pi) > sin(y*pi))

df
#> # A tibble: 500 × 3
#>    response       x       y
#>    <fct>      <dbl>   <dbl>
#>  1 FALSE    -0.681   0.707 
#>  2 FALSE    -0.711   0.332 
#>  3 FALSE    -0.702   0.467 
#>  4 TRUE      0.0289 -0.371 
#>  5 TRUE     -0.0143  0.335 
#>  6 TRUE      0.233  -0.0722
#>  7 FALSE    -0.105   0.301 
#>  8 FALSE    -0.889   0.572 
#>  9 FALSE    -0.989   0.803 
#> 10 FALSE    -0.556   0.0548
#> # … with 490 more rows

plot(df)

You can fit different models and see how each one makes its predictions.

plot(fit_logistic_regression(df, order = 4)) +
plot(fit_classification_tree(df))            +
plot(fit_classification_random_forest(df))   +
plot(fit_knn(df))                            +
  plot_layout(guides = "collect")
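
To get a feel for why a higher-order logistic regression can follow this boundary, here is a base R sketch that does not call {klassets}. It assumes predictors drawn uniformly from (-1, 1) and uses `poly()` terms as a stand-in for whatever `order = 4` does inside the package; the `accuracy()` helper is hypothetical and measures in-sample accuracy only.

```r
set.seed(123)

# Hypothetical data with the same relationship as the example above
n <- 500
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)
sim <- data.frame(
  response = factor(sin(x * pi) > sin(y * pi)),
  x = x,
  y = y
)

# Logistic regression with linear terms only
fit_linear <- glm(response ~ x + y, data = sim, family = binomial())

# Logistic regression with polynomial terms up to degree 4
fit_poly <- glm(response ~ poly(x, 4) + poly(y, 4), data = sim, family = binomial())

# In-sample share of observations classified correctly at a 0.5 cutoff
accuracy <- function(fit) {
  pred <- predict(fit, type = "response") > 0.5
  mean(pred == (sim$response == "TRUE"))
}

acc_linear <- accuracy(fit_linear)
acc_poly <- accuracy(fit_poly)
c(linear = acc_linear, poly = acc_poly)
```

The linear model can only draw a straight boundary, so the polynomial fit should classify noticeably more points correctly on this data.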

How K-means works

Another example of what can be done with {klassets}.
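
For readers who want the mechanics behind the heading, the two alternating steps of K-means can be sketched in base R. This is a minimal version on hypothetical data with a hand-picked initialisation, not the package's implementation:

```r
set.seed(123)

# Hypothetical data: two well-separated clouds in the plane
pts <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.5), ncol = 2)
)

k <- 2
# Start from two data points (one per cloud) so no cluster starts empty
centers <- pts[c(1, 51), , drop = FALSE]

for (iter in 1:10) {
  # Assignment step: squared distance from every point to every center
  d2 <- sapply(seq_len(k), function(j) rowSums(sweep(pts, 2, centers[j, ])^2))
  cluster <- max.col(-d2, ties.method = "first")

  # Update step: each center moves to the mean of its assigned points
  new_centers <- t(sapply(
    seq_len(k),
    function(j) colMeans(pts[cluster == j, , drop = FALSE])
  ))

  # Stop once the centers no longer move
  if (all(abs(new_centers - centers) < 1e-8)) break
  centers <- new_centers
}

centers
table(cluster)
```

In practice `stats::kmeans()` does the same job with better initialisation options; the loop is only meant to make the assignment and update steps explicit.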

Where to start

You can check:

  • vignette("Quasi-Anscombe-data-sets") to learn more about the sim_quasianscombe_set* family of functions.
  • vignette("Binary-classification")/vignette("Regression") to see classifiers/regression models/methods.
  • vignette("Clustering") to see clustering functions.
  • vignette("MNIST") to work with this data set to compare models and check some variable importance metrics.

Installation

You can install the development version of klassets from GitHub with:

# install.packages("remotes")
remotes::install_github("jbkunst/klassets")

Extra Info(?!)

Why the name Klassets? Just a weird merge of Class/Klass and sets.

Some inspiration and similar ideas:

Contributors

jbkunst, snanalyst
