Skip to content

Question on mixed model design to analyze microarray data

3 messages · Ben Bolker, Tom Wenseleers

#
Tom Wenseleers <Tom.Wenseleers at ...> writes:
(URL broken, sorry)
I think you probably want a random effect of 

(1|marray) + (1|marray:gene) + (1|marray:pin)+(1|marray:dip)

or equivalently

(1|marray/gene) + (1|marray:pin) + (1|marray:dip)

 you almost certainly _don't_ want (gene|marray), which will try
to fit correlations among gene effects by microarray (i.e a 10K by 10K
correlation matrix), which will explode your computer.  At a quick
read I'm not quite sure what distinction the SAS authors are making
between "X by Y" and "X within Y": I would code "X by Y" as
(1|X) + (1|Y) + (1|X:Y)  [ (1|X*Y) might work, but I haven't tried it ]
and "X within Y" as (1|Y/X) or (1|Y) + (1|Y:X)
Not at present in lmer.  You might be able to code these
random effects in nlme (see the relevant pages of Pinheiro &
Bates 2000, referenced in http://glmm.wikidot.com/faq under
"Crossed random effects", but lme4 is likely to be a lot faster.
1 day later
#
Dear Ben,
Many thanks for your message - after some experimentation I did indeed confirm that the correct specification in lmer was the one you mentioned, i.e. the correct code to get almost identical output as in SAS/glimmix was (as I thought it might be useful to share this for others I attach the datafile, and also the PDF with the original SAS analysis)

microarray=read.csv("microarray_dataset1612_R.csv",colClasses=c(rep("factor",8),"numeric"))
library(lme4)
library(lmerTest)
#options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly")) # not sure if this would be necessary
fit=lmer(log_intensity~(1|array/gene)+(1|array:pin)+(1|array:dip)+dye+trt+gene+dye:gene+trt:gene+pin, data=microarray) 
summary(fit)
anova(fit,ddf="Kenward-Roger")

(the degrees of freedom are slightly different though as in the SAS output, not sure why that is, but the p vals are practically identical)

The only remaining question I now have is how I could obtain the least square means, equivalent to the SAS code
lsmeans trt*gene / slice=gene
slicediff=gene
slicedifftype=control('0' '2');
(cf PDF in attachment)

Would anyone happen to know perhaps how I could obtain these using package lsmeans? (I am new to that package, so apologies if this would happen to be trivial)

Cheers,
Tom


-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Ben Bolker
Sent: 29 June 2013 17:58
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Question on mixed model design to analyze microarray data

Tom Wenseleers <Tom.Wenseleers at ...> writes:
(URL broken, sorry)
I think you probably want a random effect of 

(1|marray) + (1|marray:gene) + (1|marray:pin)+(1|marray:dip)

or equivalently

(1|marray/gene) + (1|marray:pin) + (1|marray:dip)

 you almost certainly _don't_ want (gene|marray), which will try to fit correlations among gene effects by microarray (i.e a 10K by 10K correlation matrix), which will explode your computer.  At a quick read I'm not quite sure what distinction the SAS authors are making between "X by Y" and "X within Y": I would code "X by Y" as
(1|X) + (1|Y) + (1|X:Y)  [ (1|X*Y) might work, but I haven't tried it ] and "X within Y" as (1|Y/X) or (1|Y) + (1|Y:X)
Not at present in lmer.  You might be able to code these random effects in nlme (see the relevant pages of Pinheiro & Bates 2000, referenced in http://glmm.wikidot.com/faq under "Crossed random effects", but lme4 is likely to be a lot faster.

_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-------------- next part --------------
A non-text attachment was scrubbed...
Name: littell et al 2006_microarray example.pdf
Type: application/pdf
Size: 148460 bytes
Desc: littell et al 2006_microarray example.pdf
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20130630/e9c2cba1/attachment-0001.pdf>