lme for data that is not normally distributed
For what it's worth, this graph is assessing linearity/heteroscedasticity rather than Normality (you would want a Q-Q plot, not a fitted vs residuals plot, for that). This doesn't look too terrible, but there does seem to be a bit of 'flare' at the large-fitted-value end, which supports Paul's suggestion that you try a log transformation ...
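To make the distinction concrete, here is a minimal sketch of the two plots side by side. The data are simulated stand-ins (hypothetical values, not Moses's dataset), and the model formula simply mirrors the one quoted below.

```r
library(lme4)

# Simulated stand-in data (hypothetical; NOT the poster's real dataset)
set.seed(1)
n_id <- 10; n_per <- 12
dat <- data.frame(
  ID   = factor(rep(seq_len(n_id), each = n_per)),
  Time = factor(rep(c("Pre_dry", "Post_dry"), times = n_id * n_per / 2))
)
id_eff <- rnorm(n_id, sd = 0.3)
dat$Distance <- exp(1.5 + 0.2 * (dat$Time == "Post_dry") +
                    id_eff[as.integer(dat$ID)] + rnorm(nrow(dat), sd = 0.4))

m <- lmer(Distance ~ Time + (1 | ID), data = dat)
r <- resid(m)
f <- fitted(m)

op <- par(mfrow = c(1, 2))
# Fitted vs residuals: look for curvature (non-linearity) or flare (heteroscedasticity)
plot(f, r, xlab = "Fitted values", ylab = "Residuals",
     main = "Linearity / heteroscedasticity")
abline(h = 0, lty = 2)
# Q-Q plot: look for departures from the reference line (non-Normality)
qqnorm(r, main = "Normality (Q-Q plot)")
qqline(r)
par(op)
```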
On 16-08-03 03:58 PM, moses selebatso via R-sig-mixed-models wrote:
Thank you both, Paul and Alain, for your help. You both point out that
I shouldn't test for normality before running a model. I appreciate
that. Paul, I have tried your new script and I guess you were right
about experience in visually assessing normality. It is not
straightforward. Below is the plot, for your appreciation.

library(lme4)
install.packages("devtools")
library(devtools)
devtools::install_github("pcdjohnson/GLMMmisc")
library(GLMMmisc)
data <- read.csv("clipboard", sep = "\t")
m <- lmer(Distance ~ Time + (1 | ID), data = data)
sim.residplot(m)

Regards,
Moses SELEBATSO
On Wednesday, 3 August 2016, 15:54, Paul Johnson
<paul.johnson at glasgow.ac.uk> wrote:
Hi Moses,
I wouldn't test normality of residuals; better to assess them by
eye. I know this sounds ad hoc but given that almost no real
distribution in nature is perfectly normal, the question should be
"how non-normal can the residuals be before seriously harming my
inferences?". This is a more difficult question to answer and
basically requires experience. A test conflates the degree of
non-normality and sample size so a significant result can mean
"quite normal but high n" while a non-significant result can mean
"very non-normal but low n":
set.seed(1)
x <- rpois(1000, 50)
hist(x)                     # looks beautifully normal
shapiro.test(x)             # significantly non-normal
hist(log(x[1:20]))          # looks pretty bad
shapiro.test(log(x[1:20]))  # passes the test
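The same point can be checked over many samples rather than one. The sketch below uses base R only; the helper name reject_rate and the choice of 200 replicates are illustrative assumptions. It estimates how often the Shapiro-Wilk test rejects at the 5% level when the underlying distribution stays fixed and only n changes.

```r
set.seed(1)

# Proportion of Shapiro-Wilk tests with p < 0.05 for Poisson(50) samples of size n.
# reject_rate() is a hypothetical helper written for this illustration.
reject_rate <- function(n, nsim = 200) {
  p <- replicate(nsim, shapiro.test(rpois(n, 50))$p.value)
  mean(p < 0.05)
}

rate_small <- reject_rate(20)    # same distribution, low n
rate_large <- reject_rate(1000)  # same distribution, high n
print(c(n20 = rate_small, n1000 = rate_large))
```

The degree of non-normality is identical in both cases, so any difference in rejection rate is due to sample size alone.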
Given that your distance response measure is (probably) constrained
to be positive, there's a good chance that it's right-skewed and
potentially made more normal by log-transformation (if there are no
zero distances).
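A rough sketch of that suggestion, assuming simulated right-skewed data in place of the real distances (the values and object names m_raw and m_log are hypothetical): check that no distance is zero or negative, fit the model on both scales, and compare the residual plots.

```r
library(lme4)

# Simulated right-skewed, strictly positive distances (hypothetical stand-in data)
set.seed(2)
n_id <- 10; n_per <- 12
dat <- data.frame(
  ID   = factor(rep(seq_len(n_id), each = n_per)),
  Time = factor(rep(c("Pre_dry", "Post_dry"), times = n_id * n_per / 2))
)
id_eff <- rnorm(n_id, sd = 0.3)
dat$Distance <- exp(1.5 + 0.2 * (dat$Time == "Post_dry") +
                    id_eff[as.integer(dat$ID)] + rnorm(nrow(dat), sd = 0.4))

# log() is only usable if every distance is strictly positive
stopifnot(all(dat$Distance > 0))

m_raw <- lmer(Distance ~ Time + (1 | ID), data = dat)
m_log <- lmer(log(Distance) ~ Time + (1 | ID), data = dat)

op <- par(mfrow = c(1, 2))
plot(fitted(m_raw), resid(m_raw), xlab = "Fitted", ylab = "Residuals",
     main = "Raw scale")
abline(h = 0, lty = 2)
plot(fitted(m_log), resid(m_log), xlab = "Fitted", ylab = "Residuals",
     main = "Log scale")
abline(h = 0, lty = 2)
par(op)
```

Note that coefficients from the log-scale model describe multiplicative rather than additive effects on distance.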
A good way to visually assess residuals is to plot them against the
fitted values, then compare these to residuals simulated from the
fitted model; they should look similar, give or take sampling
variation. You can do this with a function I recently wrote called
sim.residplot (available here:
https://github.com/pcdjohnson/GLMMmisc), although you'll have to
refit your model using lmer in the lme4 package:
library(lme4)
library(GLMMmisc)
m <- lmer(Distance ~ Time + (1 | ID), data = data)
sim.residplot(m)  # repeat a few times to allow for sampling variation
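For anyone who cannot install GLMMmisc, the general idea can be approximated with lme4 alone. This is only a sketch of the approach described above, not necessarily how sim.residplot is implemented, and it uses simulated stand-in data (hypothetical values). It draws one new response from the fitted model with simulate(), refits with refit(), and plots observed and simulated residuals next to each other.

```r
library(lme4)

# Simulated stand-in data (hypothetical; deliberately right-skewed)
set.seed(3)
n_id <- 10; n_per <- 12
dat <- data.frame(
  ID   = factor(rep(seq_len(n_id), each = n_per)),
  Time = factor(rep(c("Pre_dry", "Post_dry"), times = n_id * n_per / 2))
)
id_eff <- rnorm(n_id, sd = 0.3)
dat$Distance <- exp(1.5 + 0.2 * (dat$Time == "Post_dry") +
                    id_eff[as.integer(dat$ID)] + rnorm(nrow(dat), sd = 0.4))

m <- lmer(Distance ~ Time + (1 | ID), data = dat)

# One response vector simulated from the fitted model, i.e. data for which
# the model's assumptions hold by construction
y_sim <- simulate(m, nsim = 1)[[1]]
m_sim <- refit(m, newresp = y_sim)

op <- par(mfrow = c(1, 2))
plot(fitted(m), resid(m), xlab = "Fitted", ylab = "Residuals",
     main = "Observed")
abline(h = 0, lty = 2)
plot(fitted(m_sim), resid(m_sim), xlab = "Fitted", ylab = "Residuals",
     main = "Simulated from model")
abline(h = 0, lty = 2)
par(op)
```

Rerunning the simulate/refit/plot steps a few times gives a feel for how much the right-hand panel varies by chance alone.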
Good luck, Paul
On 3 Aug 2016, at 14:25, moses selebatso via R-sig-mixed-models <r-sig-mixed-models at r-project.org> wrote:

Thank you very much for your helpful advice. I ran the model and tested the residuals. They are not normally distributed, and I am still stuck on how to proceed. I tried to copy the output into the email, but I get an error message that the message format cannot be sent.

Regards,
Moses

On Wednesday, 3 August 2016, 12:15, Highland Statistics Ltd <highstat at highstat.com> wrote:
Date: Wed, 3 Aug 2016 09:40:20 +0000 (UTC)
From: moses selebatso <selebatsom at yahoo.co.uk>
To: R-sig-mixed-models <r-sig-mixed-models at r-project.org>
Subject: [R-sig-ME] lme for data that is not normally distributed
Message-ID: <127496753.15122202.1470217220406.JavaMail.yahoo at mail.yahoo.com>
Content-Type: text/plain; charset="UTF-8"

Hello,

I have some data that I would like to analyse with mixed models (lme). As a standard procedure I tested the data for normality and they are not normal. Any ideas on how to deal with this kind of data? I have a sample below and the model that I was hoping to use (if the data were normal):

m <- lme(Distance~Time,random=~1|ID,data=data)
Checking normality of the response variable before doing the analysis is a misconception. Why should it be normally distributed? Fit your model and check your residuals for normality. Alain
ID      Time      Distance
10187A  Pre_dry   4.31287
10187A  Pre_dry   6.867578
10187A  Pre_dry   4.640427
10187A  Post_dry  4.497807
10187A  Post_dry  9.726069
10187A  Post_dry  5.150089

Regards,
Moses SELEBATSO
_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-models at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

End of R-sig-mixed-models Digest, Vol 116, Issue 4
--
Dr. Alain F. Zuur

First author of:
1. Beginner's Guide to GAMM with R (2014).
2. Beginner's Guide to GLM and GLMM with R (2013).
3. Beginner's Guide to GAM with R (2012).
4. Zero Inflated Models and GLMM with R (2012).
5. A Beginner's Guide to R (2009).
6. Mixed effects models and extensions in ecology with R (2009).
7. Analysing Ecological Data (2007).

Highland Statistics Ltd.
9 St Clair Wynd
UK - AB41 6DZ Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com