Question on mixed model design to analyze microarray data

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20130629/cbff4376/attachment.pl>
Tom Wenseleers <Tom.Wenseleers at ...> writes:
Dear all,
I was just trying to analyse some microarray gene expression
data using mixed models in R.
Based on the SAS code given here
http://support.sas.com/documentation/cdl/en/
statug/63033/HTML/default/viewer.htm#statug_hpmixed_sect035.htm
(URL broken, sorry)
   proc hpmixed data=microarray;
      class marray dye trt gene pin dip;
      model log2i = dye trt gene dye*gene trt*gene pin;
      random marray marray*gene dip(marray) pin*marray;
      test trt;
   run;

I wanted to try to run the equivalent model in 
R using lmer and then test significance using lmerTest and/or
lsmeans. Am I correct that the correct syntax would
be
lmer(log2i~dye+trt+gene+dye:gene+trt:gene+pin+(1|marray)+
(gene|marray)+(1|marray/dip)+(pin|marray), data=microarray)
(gene has about 10 000 levels and marray 10, would this run in lmer?)
I think you probably want a random effect of 

(1|marray) + (1|marray:gene) + (1|marray:pin)+(1|marray:dip)

or equivalently

(1|marray/gene) + (1|marray:pin) + (1|marray:dip)

 you almost certainly _don't_ want (gene|marray), which will try
to fit correlations among gene effects by microarray (i.e a 10K by 10K
correlation matrix), which will explode your computer.  At a quick
read I'm not quite sure what distinction the SAS authors are making
between "X by Y" and "X within Y": I would code "X by Y" as
(1|X) + (1|Y) + (1|X:Y)  [ (1|X*Y) might work, but I haven't tried it ]
and "X within Y" as (1|Y/X) or (1|Y) + (1|Y:X)
Also, would it be possible in any way to allow 
variances across treatment (trt) groups to be unequal? Or is
that only possible in nlme/lme?
(but that wouldn't allow for such complex random effect designs, right?)
Not at present in lmer.  You might be able to code these
random effects in nlme (see the relevant pages of Pinheiro &
Bates 2000, referenced in http://glmm.wikidot.com/faq under
"Crossed random effects", but lme4 is likely to be a lot faster.
Dear Ben,
Many thanks for your message - after some experimentation I did indeed confirm that the correct specification in lmer was the one you mentioned, i.e. the correct code to get almost identical output as in SAS/glimmix was (as I thought it might be useful to share this for others I attach the datafile, and also the PDF with the original SAS analysis)

microarray=read.csv("microarray_dataset1612_R.csv",colClasses=c(rep("factor",8),"numeric"))
library(lme4)
library(lmerTest)
#options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly")) # not sure if this would be necessary
fit=lmer(log_intensity~(1|array/gene)+(1|array:pin)+(1|array:dip)+dye+trt+gene+dye:gene+trt:gene+pin, data=microarray) 
summary(fit)
anova(fit,ddf="Kenward-Roger")

(the degrees of freedom are slightly different though as in the SAS output, not sure why that is, but the p vals are practically identical)

The only remaining question I now have is how I could obtain the least square means, equivalent to the SAS code
lsmeans trt*gene / slice=gene
slicediff=gene
slicedifftype=control('0' '2');
(cf PDF in attachment)

Would anyone happen to know perhaps how I could obtain these using package lsmeans? (I am new to that package, so apologies if this would happen to be trivial)

Cheers,
Tom

-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Ben Bolker
Sent: 29 June 2013 17:58
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Question on mixed model design to analyze microarray data

Tom Wenseleers <Tom.Wenseleers at ...> writes:
Dear all,
I was just trying to analyse some microarray gene expression data 
using mixed models in R.
Based on the SAS code given here
http://support.sas.com/documentation/cdl/en/
statug/63033/HTML/default/viewer.htm#statug_hpmixed_sect035.htm
(URL broken, sorry)
   proc hpmixed data=microarray;
      class marray dye trt gene pin dip;
      model log2i = dye trt gene dye*gene trt*gene pin;
      random marray marray*gene dip(marray) pin*marray;
      test trt;
   run;

I wanted to try to run the equivalent model in R using lmer and then 
test significance using lmerTest and/or lsmeans. Am I correct that the 
correct syntax would be 
lmer(log2i~dye+trt+gene+dye:gene+trt:gene+pin+(1|marray)+
(gene|marray)+(1|marray/dip)+(pin|marray), data=microarray) (gene has 
about 10 000 levels and marray 10, would this run in lmer?)
I think you probably want a random effect of 

(1|marray) + (1|marray:gene) + (1|marray:pin)+(1|marray:dip)

or equivalently

(1|marray/gene) + (1|marray:pin) + (1|marray:dip)

 you almost certainly _don't_ want (gene|marray), which will try to fit correlations among gene effects by microarray (i.e a 10K by 10K correlation matrix), which will explode your computer.  At a quick read I'm not quite sure what distinction the SAS authors are making between "X by Y" and "X within Y": I would code "X by Y" as
(1|X) + (1|Y) + (1|X:Y)  [ (1|X*Y) might work, but I haven't tried it ] and "X within Y" as (1|Y/X) or (1|Y) + (1|Y:X)
Also, would it be possible in any way to allow variances across 
treatment (trt) groups to be unequal? Or is that only possible in 
nlme/lme?
(but that wouldn't allow for such complex random effect designs, 
right?)
Not at present in lmer.  You might be able to code these random effects in nlme (see the relevant pages of Pinheiro & Bates 2000, referenced in http://glmm.wikidot.com/faq under "Crossed random effects", but lme4 is likely to be a lot faster.

_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-------------- next part --------------
A non-text attachment was scrubbed...
Name: littell et al 2006_microarray example.pdf
Type: application/pdf
Size: 148460 bytes
Desc: littell et al 2006_microarray example.pdf
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20130630/e9c2cba1/attachment-0001.pdf>