Y is a brain measure that has been standardized. A histogram of Y is here:
http://imgur.com/Um8yyuu
I am confused about the "Y must be non-negative and the dataset
contains observations close to 0" part. Is that a requirement for
Y? If so, then my model could be wrong.
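A quick way to look for a boundary is a density plot with a rug. The sketch below is only illustrative: the real brain measure is not available here, so `raw` is simulated and the names `raw`, `Y`, and `d` are assumptions. It also shows that a z-scored variable is centered at 0 and takes negative values, so any bound on the raw scale would simply be shifted by standardizing.

```r
# Hypothetical stand-in for the brain measure (simulated, NOT the real data)
set.seed(1)
raw <- rnorm(379, mean = 50, sd = 10)

# Standardize to z-scores: mean 0, standard deviation 1
Y <- as.numeric(scale(raw))

range(Y)            # a standardized variable spans negative and positive values

d <- density(Y)     # kernel density estimate
plot(d, main = "Density of Y")
rug(Y)              # ticks piling up at one edge would hint at a boundary
```

If the density drops off smoothly on both sides, as in the histogram linked above, a hard lower bound is probably not the issue.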
On Wed, Oct 7, 2015 at 10:15 AM, Thierry Onkelinx
<thierry.onkelinx at inbo.be> wrote:
Can you elaborate on what Y is? Does it have a lower boundary? And if so,
do you have observations near that boundary? E.g. Y must be non-negative and
the dataset contains observations close to 0. A densityplot would be
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
To call in the statistician after the experiment is done may be no more than
asking him to perform a post-mortem examination: he may be able to say what
the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
2015-10-07 17:09 GMT+02:00 Yizhou Ma <maxxx848 at umn.edu>:
Hi Thierry,
Thank you for your reply and sorry for the HTML thing. Below is my
summary(model) output.
Y, Drink, and Age are continuous variables.
Gender is a factor with levels F and M.
Family_ID is a factor.
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: Y ~ Drink * Gender + Age + (1 | Family_ID)
   Data: data

     AIC      BIC   logLik deviance df.resid
  1046.4   1074.0   -516.2   1032.4      372

Scaled residuals:
     Min       1Q   Median       3Q      Max
-2.67228 -0.56085 -0.02968  0.66166  2.91452

Random effects:
 Groups    Name        Variance Std.Dev.
 Family_ID (Intercept) 0.3550   0.5958
 Residual              0.6162   0.7850
Number of obs: 379, groups:  Family_ID, 189

Fixed effects:
               Estimate Std. Error t value
(Intercept)     1.10309    0.43921   2.511
Drink           0.16425    0.08031   2.045
Gender.M       -0.19364    0.10874  -1.781
Age            -0.03377    0.01489  -2.268
Drink:Gender.M -0.13647    0.10681  -1.278

Correlation of Fixed Effects:
         (Intr) Drnk   Gndr.M Age
Drink    -0.098
Gender.M -0.040 -0.249
Age      -0.985  0.158 -0.054
Drnk:G.M  0.042 -0.737 -0.021 -0.085
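
For orientation, the two variance components in the output above imply an intraclass correlation, i.e. the share of the non-fixed-effect variance that sits between families. This is a small worked calculation using only numbers from the summary; the variable names are illustrative.

```r
# Variance components copied from the summary(model) output
var_family   <- 0.3550   # Family_ID (Intercept)
var_residual <- 0.6162   # Residual

# Intraclass correlation: between-family share of the unexplained variance
icc <- var_family / (var_family + var_residual)
round(icc, 3)   # 0.366
```

Roughly a third of the unexplained variance is attributed to family, which, with only 1-4 siblings per family, means the family intercepts are estimated from very little data each.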
Thank you very much,
Cherry
On Wed, Oct 7, 2015 at 5:14 AM, Thierry Onkelinx
<thierry.onkelinx at inbo.be> wrote:
Dear Cherry,
Please don't post in HTML. Have a look at the posting guide.
You'll need to provide more information. What is the class of each variable
(continuous, count, presence/absence, factor, ...)? What is the output of
summary(model)?
Best regards,
ir. Thierry Onkelinx
2015-10-06 17:15 GMT+02:00 Yizhou Ma <maxxx848 at umn.edu>:
Dear LMM experts:
I am pretty new to using LMM and I have found the following situation
bewildering as I was trying to do diagnostics with my fitted model: the
conditional residuals correlate highly with the fitted values.
I have a dataset with multiple families, each with 1-4 siblings. I am trying
to regress Y onto EVs including Drink, Gender, & Age, while using a random
intercept for family. This is the model I used:
model <- lmer(Y ~ Drink * Gender + Age + (1 | Family_ID), data, REML = FALSE)
After fitting the model, I used
plot(model)
to see the relationship between conditional residuals and fitted values. I
expected them to be uncorrelated and I expected to see homoscedasticity.
Yet to my surprise there is a high correlation (~0.5) between the residuals
and the fitted values (see here <http://imgur.com/pPsG4aR>). I know from
GLM that this usually suggests nonlinear relationships between the EVs and
the DV.
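The correlation described above can be computed directly from a fitted model. This is a minimal sketch, assuming the lme4 package is installed; since the original data are not available, it uses lme4's bundled sleepstudy data as a stand-in, and `fit`, `f`, and `r` are illustrative names.

```r
library(lme4)

# Stand-in model: random intercept per Subject, fitted by ML as in the post
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = FALSE)

f <- fitted(fit)   # conditional fitted values: fixed part + predicted intercepts
r <- resid(fit)    # conditional residuals: observed minus conditional fitted

cor(f, r)          # correlation between the two quantities

# Manual version of the residual-vs-fitted diagnostic plot
plot(f, r, xlab = "Fitted values", ylab = "Conditional residuals")
abline(h = 0, lty = 2)
```

Replacing `fit` with the model from the post would give the ~0.5 figure quoted above, assuming it was computed this way.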
I read some online posts (post1) that suggest this can result from a poor
model fit. So I tried a few different models, including: 1) log-transforming
Drink, which is originally positively skewed; 2) adding random slopes for
Drink, Age, etc. None of these changes has led to a substantial difference
in the residual & fitted value correlation.
Some other info:
1) my overall model fit is not poor, as indicated by the correlation between
fitted values & Y, which is around 0.8;
2) most variables in my model have a normal, or at least symmetrical,
distribution;
3) conditional residuals are normally distributed as shown in
4) conditional residuals are not correlated with any fixed effects, such as
Drink or Age.
I have two guesses as to what is going on:
1) maybe the fact that each family is a different size actually violates
assumptions of the model?
2) or maybe there is something wrong with the estimation of the random effect
(family intercept)?
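One way to probe both guesses is to simulate data for which a random-intercept model is exactly right and see whether the same pattern appears. The sketch below is a deliberately simplified stand-in, not lmer itself: it assumes an intercept-only model with a true mean of 0, treats the variance components as known (values taken from the summary output), and computes the shrunken family intercepts by hand. All names are illustrative.

```r
set.seed(42)

n_fam    <- 2000                                   # many families, for a stable estimate
fam_size <- sample(1:4, n_fam, replace = TRUE)     # 1-4 siblings per family
family   <- rep(seq_len(n_fam), times = fam_size)  # family index per observation

var_u <- 0.3550   # family-intercept variance (from the summary output)
var_e <- 0.6162   # residual variance (from the summary output)

# Simulate from the assumed model: y = family intercept + noise
u <- rnorm(n_fam, sd = sqrt(var_u))
y <- u[family] + rnorm(length(family), sd = sqrt(var_e))

# Shrinkage predictor of each family intercept with known variances:
# the family mean is pulled toward 0, more strongly for small families
fam_mean <- as.numeric(tapply(y, family, mean))
lambda   <- var_u / (var_u + var_e / fam_size)
u_hat    <- lambda * fam_mean

fitted_cond <- u_hat[family]      # conditional fitted values
resid_cond  <- y - fitted_cond    # conditional residuals

cor(fitted_cond, resid_cond)      # positive even though the model is correct
```

In this simplified setting the positive correlation comes from shrinkage: part of each true family effect is left in the conditional residuals, so it need not signal unequal family sizes violating an assumption or a failed estimation.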
I'd really appreciate your insights as to what is going on here and whether
there are any problems with my model.
Thank you very much,
Cherry