
Comments (5)

florianhartig commented on August 16, 2024

The key to this is to understand that each single residual is essentially a p-value, and thus uniformly distributed under H0.

If we expect a uniform distribution for EACH residual, we also expect that

  1. ALL residuals are uniformly distributed
  2. Any SUBSET of residuals is uniformly distributed

This is essentially what the standard DHARMa plots test: the left plot shows the joint distribution of all residuals, and the right plot shows the residuals ordered against a predictor. If we group residuals according to a particular value of the predictor, they should still be uniform (which is what the second statement you cite refers to).

image
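
To make the two statements above concrete, here is a minimal base-R sketch. It does not use DHARMa: the helper `sim_residuals` is a hypothetical name, and the Poisson model and its coefficients are assumptions chosen only for illustration. It mimics the idea of a simulation-based residual, i.e. the position of each observation within simulations from the fitted model, with ties randomized.

```r
set.seed(1)

# Hypothetical helper (not part of DHARMa): for each observation, the
# share of simulated values below it, plus a random share of the ties.
sim_residuals <- function(y, sims) {
  below <- rowMeans(sims < y)
  equal <- rowMeans(sims == y)
  below + runif(length(y)) * equal
}

# Assumed data-generating process: Poisson regression on one predictor
n <- 500
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.8 * x))

# Correctly specified model, so H0 holds
fit <- glm(y ~ x, family = poisson)

# 250 simulated responses per observation (rows = observations)
n_sim <- 250
sims <- matrix(rpois(n * n_sim, lambda = fitted(fit)), nrow = n)

res <- sim_residuals(y, sims)

# 1. ALL residuals: compare against the uniform distribution
ks_all <- ks.test(res, "punif")

# 2. A SUBSET (here: observations with x > 0) should also look uniform
ks_subset <- ks.test(res[x > 0], "punif")

print(ks_all$p.value)
print(ks_subset$p.value)
```

Under a correctly specified model, large p-values are expected most of the time, although any single run can still reject by chance.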

from dharma.

tqdo commented on August 16, 2024

thanks


tqdo commented on August 16, 2024

Another related question if you don't mind:

My understanding from your answer is that if the model is not fitted correctly, the residuals will not follow a uniform distribution. I did an experiment in which I intentionally omitted, during training, a feature that was used to generate y. What I observed was: the residuals are very non-uniform when plotted against that missing feature, but they appear to be almost uniform when plotted against a random unrelated feature. This intuitively makes sense to me (the plot suggests that the missing feature can help us explain the response while the random unrelated feature has no value), but I don't get why the residuals would appear to be uniform for that unrelated random feature?

Code in R and plots

```r
set.seed(666)
library(DHARMa)

x1 = rnorm(1000)
x2 = rnorm(1000)
z = 1 + 2*x1 + 3*x2
pr = 1/(1+exp(-z))
y = rbinom(1000,1,pr)
df = data.frame(y=y,x1=x1,x2=x2)

# Omit x2 during training
fittedModel = glm(y ~ x1, data = df, family = "binomial")
simulationOutput <- simulateResiduals(fittedModel = fittedModel, plot = FALSE)

# Strong deviations from uniformity
plotResiduals(simulationOutput, x2)
# [Screenshot 2023-01-24 at 4 31 49 PM]

# Minimal deviations from uniformity
plotResiduals(simulationOutput, runif(1000))
# [Screenshot 2023-01-24 at 4 31 58 PM]
```


florianhartig commented on August 16, 2024

What I state is an implication of H0: H0 => i.i.d. uniform residuals. From that, it does not follow that !H0 => not uniform. Uniform residuals are therefore not a guarantee that the model is correct, but if you see non-uniformity, you know that something is wrong. This is the reason why there are so many different plots / tests.
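
The one-directional implication can be illustrated with a small base-R sketch of the experiment discussed in this thread. It is an approximation written for illustration, not DHARMa's implementation: `sim_residuals` is a hypothetical helper and `noise` is an assumed unrelated variable. One plausible reading is that a variable independent of both the response and the model cannot sort the residuals into "too high" and "too low" groups, so any subset defined by it looks like the pooled residuals, whereas the omitted predictor `x2` can.

```r
set.seed(666)

# Hypothetical helper (not part of DHARMa): randomized rank of each
# observation within simulations from the fitted model.
sim_residuals <- function(y, sims) {
  below <- rowMeans(sims < y)
  equal <- rowMeans(sims == y)
  below + runif(length(y)) * equal
}

n <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
noise <- runif(n)  # assumed variable, unrelated to y
y <- rbinom(n, 1, plogis(1 + 2 * x1 + 3 * x2))

# Misspecified model: x2 is omitted
fit <- glm(y ~ x1, family = binomial)

# 250 simulated 0/1 responses per observation (rows = observations)
n_sim <- 250
sims <- matrix(rbinom(n * n_sim, size = 1, prob = fitted(fit)), nrow = n)

res <- sim_residuals(y, sims)

# Pooled residuals are centered near 0.5 despite the misspecification
mean_res <- mean(res)

# Residuals trend with the omitted predictor ...
cor_x2 <- cor(res, x2)

# ... but show no trend against the unrelated variable
cor_noise <- cor(res, noise)

print(c(mean = mean_res, cor_x2 = cor_x2, cor_noise = cor_noise))
```

A correlation is only a crude summary of the pattern; a plot of `res` against each variable shows the same contrast in more detail.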

All this is, however, the same for all residual checks - in an OLS, you can also have a perfect QQ plot and then you see a pattern in residual ~ predictor.
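
A minimal sketch of the OLS analogue, under assumed settings (the coefficients, sample size, and the omitted normal predictor `x2` are illustrative choices, not taken from the thread): because the omitted term and the noise are both normal, the residuals remain normal and the QQ plot looks fine, yet the residuals depend strongly on `x2`.

```r
set.seed(42)

n <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 + 3 * x2 + rnorm(n)

# Misspecified linear model: x2 is omitted
fit <- lm(y ~ x1)
r <- resid(fit)

# Residuals are a sum of normal terms, so a QQ plot gives little warning;
# plotting them against x2 reveals the pattern.
if (interactive()) {
  qqnorm(r)
  qqline(r)
  plot(x2, r, xlab = "x2 (omitted predictor)", ylab = "residual")
}

# Numeric summary of the pattern in residual ~ predictor
cor_x2 <- cor(r, x2)
print(cor_x2)
```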

So, what you are doing with the residual checks is to perform a number of sanity checks on your model, but that doesn't guarantee that it is correct.


florianhartig commented on August 16, 2024

See also the section on interpreting residuals in the vignette https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html#interpreting-residuals-and-recognizing-misspecification-problems

