

rwa


Perform a Relative Weights Analysis in R

Background

Relative Weights Analysis (RWA) is a method for calculating the relative importance of predictor variables in contributing to an outcome variable. The implementation in this package is based on Tonidandel and LeBreton (2015), but the origin of this specific approach can be traced back to Johnson (2000), "A Heuristic Method for Estimating the Relative Weight of Predictor Variables in Multiple Regression". Broadly speaking, RWA belongs to a family of techniques under the umbrella of 'Relative Importance Analysis', whose other members include the 'Shapley method' and 'dominance analysis'. Within market research this is often referred to as 'Key Drivers Analysis'.

This package is built around the main function rwa(), which takes a data frame as its first argument and lets you specify the names of the predictor variables and the outcome variable as arguments.

The rwa() function in this package is compatible with dplyr / tidyverse style of piping operations to enable cleaner and more readable code.


Installation

You can install the stable CRAN version of rwa with:

install.packages("rwa")

Alternatively, you can install the latest development version from GitHub with:

install.packages("devtools")
devtools::install_github("martinctc/rwa")

Method / Technical details

RWA decomposes the total variance predicted in a regression model (R-squared) into weights that accurately reflect the proportional contribution of each predictor variable.

RWA is a useful technique for calculating the relative importance of predictors (independent variables) when the independent variables are correlated with each other. As an alternative to standard multiple regression, it addresses the multicollinearity problem and produces an importance ranking of the variables, answering the question "which variables matter most, ranked by their contribution to R-squared?".

See https://link.springer.com/content/pdf/10.1007%2Fs10869-014-9351-z.pdf.
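The mechanics can be sketched as follows. This is a simplified, self-contained illustration of Johnson's (2000) approach; the function and variable names below are illustrative, not the package's internals, and rwa::rwa() performs these steps for you:

```r
# A minimal sketch of Johnson's (2000) relative weights computation.
# For illustration only -- rwa::rwa() implements the full method.
relative_weights <- function(df, outcome, predictors) {
  cor_mat <- cor(df[c(outcome, predictors)], use = "pairwise.complete.obs")
  rxx <- cor_mat[predictors, predictors]   # predictor intercorrelations
  rxy <- cor_mat[predictors, outcome]      # predictor-outcome correlations

  # Create a set of orthogonal variables maximally related to the
  # original predictors, via an eigendecomposition of rxx
  eig    <- eigen(rxx)
  lambda <- eig$vectors %*% diag(sqrt(eig$values)) %*% t(eig$vectors)

  # Regress the outcome on the orthogonal variables, then map the
  # variance contributions back onto the original predictors
  beta <- solve(lambda) %*% rxy
  raw  <- as.numeric(lambda^2 %*% beta^2)  # raw weights sum to R-squared

  data.frame(Variables          = predictors,
             Raw.RelWeight      = raw,
             Rescaled.RelWeight = 100 * raw / sum(raw))
}

relative_weights(mtcars, "mpg", c("cyl", "disp", "hp", "gear"))
```

Run on the mtcars example shown later in this README, the raw weights from this sketch sum to the model's R-squared (0.7791896).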

Multicollinearity

When independent variables are correlated, it is difficult to determine the distinct predictive contribution of each variable, and therefore difficult to rank them, because the coefficients cannot be estimated precisely. Statistically, multicollinearity inflates the standard errors of the coefficient estimates and makes the estimates highly sensitive to minor changes in the model, leaving the coefficients unstable and difficult to interpret.
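A small simulated illustration of this instability (not part of the package; the data and variable names are made up):

```r
# Illustration: two nearly identical predictors inflate standard errors.
set.seed(123)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.05)   # x2 is almost a copy of x1
y  <- x1 + rnorm(n)

collinear <- summary(lm(y ~ x1 + x2))$coefficients
single    <- summary(lm(y ~ x1))$coefficients

collinear["x1", "Std. Error"]  # large: the model cannot separate x1 from x2
single["x1", "Std. Error"]     # small: stable estimate
```

Both models fit the data about equally well, but the collinear model's coefficient estimates are far less precise.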

Signs

Key Drivers Analysis methods do not conventionally include a score sign, which can make it difficult to tell whether a variable drives the outcome positively or negatively. When set to TRUE, the applysigns argument in rwa::rwa() applies positive or negative signs to the driver scores so that they match the signs of the corresponding linear regression coefficients from the model. This feature mimics the solution used in the Q research software. The resulting column is labelled Sign.Rescaled.RelWeight to distinguish it from the unsigned column.

Estimating the statistical significance of relative weights

As Tonidandel et al. (2009) noted, there is no default procedure for determining the statistical significance of individual relative weights:

The difficulty in determining the statistical significance of relative weights stems from the fact that the exact (or small sample) sampling distribution of relative weights is unknown.

The paper suggests a Monte Carlo method for estimating statistical significance. This is not yet available in the package, but the plan is to implement it in the near future.

Basic example

You can pass the raw data directly into rwa() without first computing a correlation matrix. The example below uses mtcars.

Code:

library(rwa)
library(tidyverse)

mtcars %>%
  rwa(outcome = "mpg",
      predictors = c("cyl", "disp", "hp", "gear"),
      applysigns = TRUE)

Results:

$predictors
[1] "cyl"  "disp" "hp"   "gear"

$rsquare
[1] 0.7791896

$result
  Variables Raw.RelWeight Rescaled.RelWeight Sign Sign.Rescaled.RelWeight
1       cyl     0.2284797           29.32274    -               -29.32274
2      disp     0.2221469           28.50999    -               -28.50999
3        hp     0.2321744           29.79691    -               -29.79691
4      gear     0.0963886           12.37037    +                12.37037

$n
[1] 32      

Latest Status

The main rwa() function is ready to use, but the intent is to develop additional functions for this package that supplement it, such as output-tidying and visualisation functions.


Contact me

Please feel free to submit suggestions and report bugs: https://github.com/martinctc/rwa/issues

Also check out my website for my other work and packages.

References / Bibliography

Azen, R., & Budescu, D. V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8(2), 129.

Budescu, D. V. (1993). Dominance analysis: a new approach to the problem of relative importance of predictors in multiple regression. Psychological Bulletin, 114(3), 542.

Grömping, U. (2006). Relative importance for linear regression in R: the package relaimpo. Journal of Statistical Software, 17(1), 1-27.

Grömping, U. (2009). Variable importance assessment in regression: linear regression versus random forest. The American Statistician, 63(4), 308-319.

Johnson, J. W. (2000). A heuristic method for estimating the relative weight of predictor variables in multiple regression. Multivariate Behavioral Research, 35(1), 1-19.

Johnson, J. W., & LeBreton, J. M. (2004). History and use of relative importance indices in organizational research. Organizational Research Methods, 7(3), 238-257.

Lindeman, R. H., Merenda, P. F., & Gold, R. Z. (1980). Introduction to Bivariate and Multivariate Analysis. Glenview, IL: Scott, Foresman.

Tonidandel, S., & LeBreton, J. M. (2011). Relative importance analysis: A useful supplement to regression analysis. Journal of Business and Psychology, 26(1), 1-9.

Tonidandel, S., & LeBreton, J. M. (2015). RWA Web: A free, comprehensive, web-based, and user-friendly tool for relative weight analyses. Journal of Business and Psychology, 30(2), 207-216.

Tonidandel, S., LeBreton, J. M., & Johnson, J. W. (2009). Determining the statistical significance of relative weights. Psychological Methods, 14(4), 387.

Wang, X., Duverger, P., & Bansal, H. S. (2013). Bayesian inference of predictors' relative importance in linear regression model using dominance hierarchies. International Journal of Pure and Applied Mathematics, 88(3), 321-339.

Also see Kovalyshyn for a similar implementation, but in JavaScript.


rwa's Issues

Categorical variables

I have used the R package rwa with great success when I have only numeric variables. However, I have now tried it with a data set containing both numeric and categorical variables, and I received an error message.

Error in cor(thedata, use = "pairwise.complete.obs") :
'x' must be numeric

I therefore wonder whether the rwa method can be used with numeric variables only, or whether there is a way to convert the categorical variables to numeric ones so that the method also works with categorical variables.
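One common workaround, sketched below, is to expand factor columns into numeric indicator variables before calling rwa(). This is not an official feature of the package, the data here is made up, and whether relative weights for individual dummy columns are substantively meaningful is a separate question:

```r
# A possible workaround (a sketch, not an official feature of the package):
# expand factor columns into 0/1 dummy variables before calling rwa().
df <- data.frame(
  y   = c(2.1, 3.4, 1.8, 4.0, 2.9, 3.7),
  x   = c(1.0, 2.0, 1.5, 3.0, 2.2, 2.8),
  grp = factor(c("a", "b", "a", "b", "b", "a"))
)

# model.matrix() creates indicator columns; drop the intercept column
# so that one factor level acts as the reference category
dummies    <- model.matrix(~ grp, data = df)[, -1, drop = FALSE]
df_numeric <- cbind(df[c("y", "x")], as.data.frame(dummies))
names(df_numeric)  # "y" "x" "grpb"
```

The resulting df_numeric contains only numeric columns, so it can be passed to rwa() with, for example, outcome = "y" and predictors = c("x", "grpb").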

Data file problem

Hi,

May I ask: in the data file, did you put in the correlation matrix, or the mean-centered values of each variable? I did not find any specification about the data file and would really appreciate your help. Thank you.

Mai

Polynomial Terms and RWA

Hi all!

I am Ioannis, a PhD student in marketing. I am interested in estimating the relative weights of my variables in the following regression model:

Z ~ Xa + Ya + I(Xa^2) + Xa:Ya

As you can see, the model has two variables, the interaction between them, and the square of one of the two. When I try to implement this in RWA, I receive the same results as if I had calculated Xa2 and XaYa beforehand and then used them in the model (i.e., Z ~ Xa + Ya + Xa2 + XaYa). This way I receive the relative importance of each of them.

Yet, my question is whether I can obtain the relative importance of Xa and Ya overall, since these are the "basic" explanatory units in my case. That is, for Xa, its own relative importance plus its importance as Xa2 and its importance in the interaction with Ya (perhaps at some level of Ya). If yes, do you have some tips for the code I can use?

Thank you very much for your time!
Ioannis

X not numeric

Hello,

I am getting the error

"Error in cor(thedata, use = "pairwise.complete.obs") :
'x' must be numeric"

Do all categorical variables need to be converted to numbers? Thank you,

Negative eigenvalues - do not get results

I have a data set with one Y and 16 predictor variables. Running the rwa function, I did not get an output. After looking into the code, I found some negative eigenvalues, which cause the next step, sqrt(eigenvalues), to crash.
Can someone please help to fix this issue?

The data is attached.

RWA.xlsx

Confidence Intervals for Weights

This is a request received by email:

I’m using your excellent rwa function in R to extract some relative weights from OLS regressions I’m running. I’m wondering, within your function, is there any way to extract the 95% CIs for each of the weights?

rwa for logistic regression

I have one question regarding the package: I want to calculate relative weights for a logistic regression, i.e. a dichotomous outcome variable. Your rwa() function does not show an error message and outputs some relative weights. However, I am unsure whether the function is also applicable to logistic regression, or whether I should not interpret those weights because the function assumes a continuous outcome variable.

Also, it is not entirely clear to me what approach you took to calculate relative weights. For example, how does your package compare to the "lmg" option in the relaimpo package?

RWA Export Result

Thanks for your great work!

I'm using this website and have an issue exporting my results: an export option is not displayed. Were you part of the development of this website, and could you help me?

95%CI of rescaled weights

I’m using your excellent rwa tools. I’m wondering if I could get the 95% CIs for rescaled weights?

Is rwa superior to multiple regression analysis?

Many thanks for providing a very useful tool! I have a quick question about your rwa package. Statistical software such as SPSS produces VIF values to indicate the multicollinearity of the data. Even when those values are acceptable (e.g. VIF < 10), do you think rwa is superior to multiple regression analysis in detecting important predictor variables?

variable scaling?

Hi Martin,

Just wondering if I am supposed to scale the predictor variables prior to running the rwa() function? Is this done automatically, or am I misinterpreting how this approach works?

Thanks!
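As the error messages elsewhere in these issues show, rwa() works from a correlation matrix (cor(thedata, use = "pairwise.complete.obs")), and correlations are invariant to linear rescaling of the variables, so standardising the predictors beforehand should not change the weights. A quick check of that invariance (an illustration, not package documentation):

```r
# Correlations are unaffected by linear rescaling, so relative weights
# computed from a correlation matrix do not depend on predictor scaling.
raw_cor    <- cor(mtcars$mpg, mtcars$hp)
scaled_cor <- cor(scale(mtcars$mpg), scale(mtcars$hp))[1, 1]

all.equal(raw_cor, scaled_cor)  # TRUE
```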

weights and missings

Hi Martin, love the package, but I have 2 questions:

  1. Is it possible to run the analysis with a separate weight variable?
  2. If I'm correct, the analysis defaults to listwise deletion of missing values. Is it also possible to run it with pairwise deletion?

Thanks!
