Comments (6)
This could become a problem if you have many variables that need to be handled this way. I'll add this feature to the woebin function. Thanks.
from scorecard.
Thanks. You can specify the breakpoints via the breaks_list option of the woebin function. You can also get the optimal binning from the dataset that excludes the two special values.
library(scorecard)
library(data.table)
dat <- data.table(
  y = c(0,0,0,1,1,1,1,1,0,0,1,1,1,0,0,0,1,1,1,0),
  x = c(1,2,3,4,5,888,888,888,9,10,666,666,666,666,15,16,17,18,19,20))
# get optimal breakpoints for rest dataset
bins <- woebin(dat[x != 666 & x != 888], "y")
# specify the breakpoints
bins2 <- woebin(dat, "y", breaks_list = list(x=c(16, 18, 666, 888)))
woebin_plot(bins2)
from scorecard.
Thanks, very detailed answer. In my case, I mean that the values '666' and '888' are categorical, so we should convert them to a factor before calling woebin.
However, the following code treats '666' and '888' as numerical values:
bins2 <- woebin(dat, "y", breaks_list = list(x=c(16, 18, 666, 888)))
In my opinion, I could do the following:
library(scorecard)
library(data.table)
library(dplyr)
dat<-data.table( y=c(0,0,0,1,1,0,0,1,0,0,1,1,1,0,0,0,1,1,1,0), x=c(1,2,3,4,5,888,888,888,9,10,666,666,666,666,15,16,17,18,19,20))
# special values
sp<-c(666,888)
#get the special data
dat_sp<-filter(dat, x %in% sp)
#get normal data
dat_nor<-filter(dat, !x %in% sp)
# convert it to factor
dat_sp$x<-as.factor(dat_sp$x)
bins_sp <- woebin(dat_sp, "y")
woebin_plot(bins_sp)
# bin for normal data
bins_nor <- woebin(dat_nor, "y")
woebin_plot(bins_nor)
Now the questions are: 1) how to combine these two plots into one plot, and 2) how to combine the two WoE tables for this variable, because we can't simply use rbind. If we have many such variables, how do we get the WoE? What I'm really worried about is that we can't bin many variables automatically. For example, many functions in your package support batch processing; obviously, if we bin special values and normal values separately, it breaks batch processing.
from scorecard.
# get optimal breakpoints for rest dataset
bins <- woebin(dat[x != 666 & x != 888], "y")
Here, can we really get the optimal breakpoints once we omit the special samples?
I also realized that computing the WoE separately is wrong.
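To illustrate why the separate computation gives different numbers, here is a minimal base-R sketch using the toy data from my code above. It assumes the convention WoE = ln(share of y==1 in the bin / share of y==0 in the bin); the helper woe_by_bin is hypothetical and not part of scorecard, and lumping all normal values into a single "other" bin is only for illustration.

```r
# Toy data from the earlier comment in this thread
dat <- data.frame(
  y = c(0,0,0,1,1,0,0,1,0,0,1,1,1,0,0,0,1,1,1,0),
  x = c(1,2,3,4,5,888,888,888,9,10,666,666,666,666,15,16,17,18,19,20)
)
sp <- c(666, 888)

# Hypothetical helper (not a scorecard function):
# WoE per bin = ln(share of y == 1 / share of y == 0),
# where the shares are taken over exactly the rows passed in.
woe_by_bin <- function(y, bin) {
  pos <- tapply(y == 1, bin, sum)
  neg <- tapply(y == 0, bin, sum)
  log((pos / sum(pos)) / (neg / sum(neg)))
}

# Each special value gets its own bin; all other values share one bin
is_sp   <- dat$x %in% sp
bin_all <- ifelse(is_sp, as.character(dat$x), "other")

# WoE using the totals of the whole dataset
woe_full <- woe_by_bin(dat$y, bin_all)

# WoE using only the special-value rows (the "separate" approach)
woe_sub <- woe_by_bin(dat$y[is_sp], bin_all[is_sp])

print(round(woe_full, 4))
print(round(woe_sub, 4))
```

The WoE of the 666 bin is about 1.30 with the full-dataset totals but about 0.81 when only the special rows are used, because the denominators (total counts of y==1 and y==0) change. So the special bins and the normal bins need to be computed against the same totals.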
from scorecard.
Thanks again for your nice solution! Looking forward to your improved version.
from scorecard.
see the following example:
library(scorecard)
dat <- data.frame(
  y = c(0,0,0,1,1,1,1,1,0,0,1,1,1,0,0,0,1,1,1,0),
  x = c(1,2,3,4,5,888,888,888,9,10,666,666,666,666,15,16,17,18,19,20))
#' specify two values as two class
bin = woebin(dat, "y", special_values = c(666,888))
#' specify two values as one class
bin2 = woebin(dat, "y", special_values = c("666%,%888"))
from scorecard.