I am reading the package's vignettes (<a href="https://cran.r-project.org/web/packages

Why uniformity in y direction if we plot against any predictor? (dharma issue, closed)

tqdo commented on August 16, 2024

Why uniformity in y direction if we plot against any predictor?

florianhartig commented on August 16, 2024

The key to this is to understand that each single residual is essentially a p-value, and thus uniformly distributed under H0.

If we expect a uniform distribution for EACH residual, we also expect that

- ALL residuals are uniformly distributed
- Any SUBSET of residuals is uniformly distributed

This is essentially what is tested in the standard DHARMa plots: the left plot shows you the joint distribution of all residuals, the right plot shows residuals ordered against a predictor, and if we group residuals according to a particular value of the predictor, they should still be uniform (which is what the second statement you cite refers to).
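The argument above can be sketched with a small base-R simulation. This is an illustrative simplification, not DHARMa's own procedure and not code from the thread: it assumes a normal response with a known true mean, so the quantile residual can be computed directly with `pnorm` instead of by simulating from a fitted model. All variable names (`u`, `p_all`, `p_subset`, `bin_means`) are made up for the example.

```r
# Illustrative assumption: the data are generated under a known, correct model (H0 true)
set.seed(42)
n  <- 1000
x  <- rnorm(n)
mu <- 1 + 2 * x                      # true conditional mean
y  <- rnorm(n, mean = mu, sd = 1)

# Quantile residual: model probability of a value <= the observed one,
# i.e. a one-sided p-value for each observation
u <- pnorm(y, mean = mu, sd = 1)

# ALL residuals together: compare against the uniform distribution on [0, 1]
p_all <- ks.test(u, "punif")$p.value

# A SUBSET selected by predictor value: should be uniform as well
p_subset <- ks.test(u[x > 1], "punif")$p.value

# Grouping by predictor quartile: each group mean should stay near 0.5
bins <- cut(x, breaks = quantile(x, probs = seq(0, 1, 0.25)), include.lowest = TRUE)
bin_means <- tapply(u, bins, mean)

print(c(p_all = p_all, p_subset = p_subset))
print(round(bin_means, 3))
```

Under a correct model, neither Kolmogorov-Smirnov test should systematically reject, and no predictor bin should drift away from 0.5. That corresponds to the absence of a trend in the y direction when residuals are plotted against a predictor.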

tqdo commented on August 16, 2024

thanks

tqdo commented on August 16, 2024

Another related question if you don't mind:

My understanding from your answer is that if the model is not fitted correctly, the residuals will not follow a uniform distribution. I ran an experiment in which I intentionally omitted, during training, a feature that was used to generate y. What I observed: the residuals are very non-uniform when plotted against that missing feature, but appear almost uniform when plotted against a random, unrelated feature. This makes intuitive sense to me (the plot suggests that the missing feature can help us explain the response, while the random unrelated feature has no value), but I don't get why the residuals would appear uniform for that unrelated random feature.

Code in R and plots

```r
set.seed(666)
library(DHARMa)

x1 = rnorm(1000)
x2 = rnorm(1000)
z = 1 + 2*x1 + 3*x2
pr = 1/(1+exp(-z))
y = rbinom(1000, 1, pr)
df = data.frame(y = y, x1 = x1, x2 = x2)

# Omit x2 during training
fittedModel = glm(y ~ x1, data = df, family = "binomial")
simulationOutput <- simulateResiduals(fittedModel = fittedModel, plot = FALSE)

# Strong deviations from uniformity
plotResiduals(simulationOutput, x2)

# Minimal deviations from uniformity
plotResiduals(simulationOutput, runif(1000))
```

florianhartig commented on August 16, 2024

What I state is an implication of H0, so H0 => i.i.d. uniform residuals. From that, it does not follow that !H0 => not uniform, so uniform residuals are not a guarantee that the model is correct, but if you see non-uniformity, you know that something is wrong. This is the reason why there are so many different plots / tests.

All this is, however, the same for all residual checks - in an OLS, you can also have a perfect QQ plot and then you see a pattern in residual ~ predictor.

So, what you are doing with the residual checks is to perform a number of sanity checks on your model, but that doesn't guarantee that it is correct.
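The OLS analogy can be made concrete with a small base-R simulation. This is an illustrative sketch, not code from the thread: it assumes a linear model with normal noise, omits `x2` from the fit, and compares the residuals against the omitted predictor and against an unrelated variable (called `unrelated` here, a made-up name).

```r
# Illustrative assumption: linear data-generating process with two predictors
set.seed(42)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
unrelated <- runif(n)                # carries no information about y
y  <- 1 + 2 * x1 + 3 * x2 + rnorm(n)

# Misspecified model: x2 is omitted
fit <- lm(y ~ x1)
res <- residuals(fit)

# Marginally, res is roughly 3 * x2 + noise, a sum of normals, so it is
# itself close to normal and a QQ plot of res looks fine
# qqnorm(res); qqline(res)          # uncomment to inspect visually

# Against the omitted predictor the problem is obvious
# (theoretical correlation is about 3 / sqrt(10))
cor_x2 <- cor(res, x2)

# Against an unrelated variable there is nothing to see
cor_unrelated <- cor(res, unrelated)

print(c(cor_x2 = cor_x2, cor_unrelated = cor_unrelated))
```

The misspecification only shows up in checks that involve the omitted variable. Conditioning on an unrelated variable averages over `x2`, so the residuals look the same in every slice, which is why a clean plot against one variable does not show that the model is correct.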

florianhartig commented on August 16, 2024

See also the section on interpreting residuals in the vignette https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html#interpreting-residuals-and-recognizing-misspecification-problems
