Implications of modeling residuals in multilevel models
Dear Ben,
A question regarding your response is inline below.
### I think it would be fair to call this a location-scale model since
you are modeling both the location (mean) and the scale (SD/variance),
although often people use the term for models where there is a random
effect in both the location and the scale model.
Q: Usually for the scale part of a location-scale model, the linear model
uses a logarithmic link to guarantee that the estimate of scale is positive:
log(scale_i) = a_0 + b_1*x_i1+ ... + b_n*x_ip (for p predictors of scale)
But in the MODEL that I sketched below, how such a guarantee is made?
MODEL <- nlme::lme(y ~ X1_categorical + X2_numeric ...,
random = ~1| subject,
data = data,
correlation = corSymm(~1|subject),
weights = varComb(varIdent(form = ~ 1 | X1_categorical ),
varPower(form = ~ X2_numeric )))
Thanks, Simon
On Fri, Dec 22, 2023 at 12:30?PM Ben Bolker <bbolker at gmail.com> wrote:
This slipped through the cracks. On 2023-12-22 1:10 p.m., Simon Harmel wrote:
Hello All, Just a follow-up, can we say the model I sketched above is a
location-scale
model? Thanks, Simon On Sat, Dec 16, 2023 at 9:30?PM Simon Harmel <sim.harmel at gmail.com>
wrote:
Hello All, I have a highly skewed dataset.
As may have been said before, the *marginal* distribution of your data set (e.g., what you get if you plot hist(y)) is irrelevant/doesn't matter at all (the models make no assumptions about the marginal distribution) But, my MODEL of choice below shows
drastically improved, normally distributed residuals (and predicted
values)
compared to other models whose residuals are not modeled. Three quick questions: 1- Is MODEL below assuming that my data come from a population that
looks
normal once "X1_categorical" and "X2_numeric" are taken into account as modeled in MODEL?
Yes, the *conditional distribution* of y is Gaussian (but not IID because you have specified correlation and variance (weights) structures).
2- Do these normally distributed predictions work better for a subject randomly drawn from a similarly skewed population with a known "X1_categorical" and "X2_numeric"?
Don't think I understand what this means.
3- I think the distribution of residuals mirrors that of the data. If
so,
is it correct to say MODEL below is actually **trimming** my highly
skewed
data as if it was distributed as:
That seems an odd way to put it. I would say that, if the residuals look Normally distributed, that means that the covariates in the model are accounting for the skew. Here's an example of a dataset where the marginal distribution is heavily skewed but the conditional distribution is (exactly) Normal: set.seed(101) x <- rgamma(1000, shape = 1) y <- rnorm(1000, mean = x, sd = 0.1) hist(y) qqnorm(residuals(lm(y~x))) Heteroscedasticity (but not autocorrelation) will also change the distribution of the residuals, but I don't think it's necessary to illustrate the point here. I think it would be fair to call this a location-scale model since you are modeling both the location (mean) and the scale (SD/variance), although often people use the term for models where there is a random effect in both the location and the scale model.
```
hist(resid(MODEL, type = "normalized"))
```
MODEL <- nlme::lme(y ~ X1_categorical + X2_numeric ...,
random = ~1| subject,
data = data,
correlation = corSymm(~1|subject),
weights = varComb(varIdent(form = ~ 1 | X1_categorical ),
varPower(form = ~
X2_numeric )))
Thanks, Simon
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-- Dr. Benjamin Bolker Professor, Mathematics & Statistics and Biology, McMaster University Director, School of Computational Science and Engineering (Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of
working hours.
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models