I am running a GLMM in R in lme4 package, the outcome variable is binary and the 10 fixed effects are a mix of categorical and continuous variables. The models have three random-effects.
I am using DHARMa to check for the GLMM assumptions.
simulateResiduals(fittedModel = cm5, asFactor=T, plot = T, quantreg=T,1000)
It doesn’t show that I have big misspecification problems however the residuals are not uniform, and the KS-test and the dispersion test is significant.
I recalculate the residuals at each random effect levels and gave the same issues.
I ran the same model but this time I categorised all the continuous fixed effects, the DRAHMa output is much better and better meet the assumptions.
I am not a fan of categorising continuous variables and I don’t want to lose information to meet the assumptions. But at the same time, I don’t want biased estimates because of not meeting the assumptions.
Please advise, which option to sacrifice.
I'm the developer of DHARMa - from what I see, the deviations are minimal, so I would not be worried about the significant p-values. You just have a lot of data, so even a minimal deviation will become significant (this is also discussed in the help / vignette).
What would be more interesting is probably to plot residuals agains predictors. If you see no patterns there, you should be fine.
Answered by Florian Hartig on November 21, 2021
Get help from others!