need help with mixed effects model

Doug and other mixed-models aficionados,

I have made some progress on my own on the problem I posted in this
thread. Briefly, I am analyzing a multifactorial genomic experiment and
wish to look at gene-gene correlations independent of Strain. Because
multiple measurements are taken per rat, I wish to use lmer. What seems
to be working is the following.

mod1 <- lmer(gene2 ~ -1 + Strain + (1|Rat) + gene1)
mod2 <- lmer(gene2 ~ -1 + Strain + (1|Rat))
anova.sum <- anova(mod1, mod2)

I look to see if adding the expression of the other gene of interest as
a covariate significantly improves the model. If it does, then I take
that as an indicator of gene-gene correlation/dependence.
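
As a minimal sketch of the comparison described above, the two models can
be fitted to simulated data and compared with a likelihood ratio test. The
data frame `d`, the simulated effect sizes, and the explicit `REML = FALSE`
are assumptions for illustration and are not from the original post.
Fitting by maximum likelihood is the usual requirement when comparing
models that differ in their fixed effects.

```r
library(lme4)

set.seed(1)
# Hypothetical layout: 2 strains x 6 rats x 3 tissues = 36 samples
d <- expand.grid(Tissue = factor(1:3), Rat = factor(1:12))
d$Strain <- factor(ifelse(as.integer(d$Rat) <= 6, "A", "B"))

# Simulated expression values (assumed effect sizes, illustration only)
rat_eff <- rnorm(12, sd = 0.5)
d$gene1 <- rnorm(36)
d$gene2 <- 0.8 * d$gene1 + rat_eff[as.integer(d$Rat)] + rnorm(36, sd = 0.5)

# Fit both models by ML so that their likelihoods are comparable
mod1 <- lmer(gene2 ~ -1 + Strain + gene1 + (1 | Rat), data = d, REML = FALSE)
mod2 <- lmer(gene2 ~ -1 + Strain + (1 | Rat), data = d, REML = FALSE)

# Likelihood ratio test for the gene1 covariate
anova.sum <- anova(mod2, mod1)
p_value <- anova.sum[["Pr(>Chisq)"]][2]
p_value
```

The extracted `p_value` is the quantity that would fill one cell of a
gene-by-gene matrix of p-values.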

The concern that Doug had, I assume, is that gene1 and gene2 are both
measured with error, whereas this type of model assumes that the
covariates are measured without error or, for practical purposes, with
much lower error than the dependent variable. Ignoring this problem
biases the coefficients towards zero, with a consequent loss of power. I
don't have any idea how important this is; it all depends on the error
of your measurements. The usual solution is structural equation
modelling (SEM). This is something I haven't tried, so I have no idea
how easy it is or how well it will work.
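
The bias towards zero mentioned above can be illustrated with a small
simulation. This is only a sketch using ordinary least squares rather than
a mixed model. The sample size, the true slope of 1, and the error
variances are assumptions chosen so that the expected attenuation factor is
var(true_x) / (var(true_x) + var(error)) = 0.5.

```r
set.seed(42)
n <- 5000

true_x <- rnorm(n)                      # error-free covariate (not observable in practice)
y      <- 1.0 * true_x + rnorm(n, sd = 0.5)  # response; true slope is 1
obs_x  <- true_x + rnorm(n, sd = 1)     # covariate as observed, with measurement error

# Slope estimated from the error-free covariate versus the noisy one
slope_clean <- coef(lm(y ~ true_x))[["true_x"]]
slope_noisy <- coef(lm(y ~ obs_x))[["obs_x"]]

c(clean = slope_clean, noisy = slope_noisy)
```

With these settings the slope from the noisy covariate should come out near
half of the slope from the error-free covariate. How much attenuation
occurs in real expression data depends on the actual measurement error.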

Ken
I am not, of course, doing this for just two genes; rather, I build an
adjacency matrix out of the p-values for all gene-gene interactions in a
list of about 400 significant genes. I then adjust the p-values for FDR,
pick a suitable FDR threshold (0.001 in this case), and create another
adjacency matrix with 1's for significant correlations and 0's for
non-significant ones. I then visualize this using Rgraphviz.
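
The thresholding step described above might look roughly like the
following in base R. The gene names and p-values are made up for
illustration, and Benjamini-Hochberg (`method = "BH"`) is assumed as the
FDR procedure since the post does not say which one was used. The sketch
also assumes one p-value per unordered gene pair, although the model
gene2 ~ gene1 need not give the same p-value as gene1 ~ gene2.

```r
genes <- paste0("g", 1:5)   # hypothetical gene identifiers

# Placeholder p-values for the 10 unordered gene pairs (illustration only)
raw_p <- c(1e-8, 0.2, 1e-6, 0.04, 0.5, 1e-5, 0.9, 0.03, 0.6, 1e-7)

# Matrix of raw p-values, filled in the upper triangle
p <- matrix(NA_real_, 5, 5, dimnames = list(genes, genes))
p[upper.tri(p)] <- raw_p

# Adjust all pairwise p-values together for FDR
adj_p <- p.adjust(p[upper.tri(p)], method = "BH")

# 0/1 adjacency matrix at the chosen FDR threshold
adj <- matrix(0L, 5, 5, dimnames = list(genes, genes))
adj[upper.tri(adj)] <- as.integer(adj_p < 0.001)
adj <- adj + t(adj)         # make the matrix symmetric

adj
```

A symmetric 0/1 matrix of this kind is the usual input for building an
undirected graph object for plotting.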

As I was tearing my hair out trying to make sure this was sensible, it
occurred to me that within my list of 400 genes I have positive
controls. About 40 of the genes are represented by 2 or more probesets,
which should be highly correlated if they are measuring the same thing.
So, I subjected just the genes with duplicate probesets to the above
procedure and, sure enough, in an overwhelming number of cases,
probesets from the same gene plot next to each other.

My conclusion from this exercise is that what I am doing is empirically
correct, although I am open to suggestions as to how it could be
improved or comments as to how I may be just plain wrong.

Doug, I am reading your book and appreciate your contributions.

Mark

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work, & Mobile & VoiceMail
(317) 204-4202 Home (no voice mail please)

mwkimpel<at>gmail<dot>com

******************************************************************

Douglas Bates wrote:
On Fri, Feb 22, 2008 at 11:57 AM, Mark W Kimpel <mwkimpel at gmail.com> wrote:
This is my first foray into mixed models and, while awaiting the
arrival of:

Extending the Linear Model with R: Generalized Linear, Mixed Effects
and Nonparametric Regression Models
Mixed Effects Models in S and S-Plus

I am in need of some advice.

I would like to look at gene-gene correlations within a multi-factorial,
mixed effects experiment. Here are the factors, with levels:

Gene Expression: 2 different genes per Animal, continuous variable
Animals: 6 per Strain
Tissues: 3 per animal

Strain: 2

I thus have 6*3*2 = 36 samples

I do not care, for this analysis, about differences between Tissues,
Strains, or Animals; in fact, I want to control for them while examining
the correlation of expression of the two genes. In other words, I want
to look at something very much like the Pearson correlation coefficient
controlled for these other factors.

I guess the first question I should ask is: "is a mixed model the way to
go, and, if not, what would be the correct approach?"

Perhaps.  How do you plan to incorporate the two genes?

Assuming mixed models will work, as I see it through my newbie eyes,
Tissue and strain are fixed effects and animals are random effects.

If you were interested in just 1 gene then I would say that this looks
like a good approach.  I'm just not sure what to do about the multiple
genes.

Any suggestions for an approach and a model?

The model specification (assuming that each animal has a distinct
number) would be something like

gene1 ~ Tissue * Strain + (1|Animal)

In your earlier message to the Bioconductor list you had a
specification that looked like

gene1 ~ gene2 + ...

which makes me a little queasy because you are assuming that gene2 is
"known" relative to the variability in gene1 and most of the time that
is not a reasonable approach.
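
For reference, the single-gene specification gene1 ~ Tissue * Strain +
(1|Animal) given above could be fitted along these lines. This is a sketch
on simulated data. The data frame `d`, the factor labels, and the effect
sizes are assumptions for illustration, with each animal given a distinct
identifier as the specification requires.

```r
library(lme4)

set.seed(2)
# Hypothetical design: 2 strains x 6 animals x 3 tissues = 36 samples,
# with animals numbered 1..12 so that each animal has a distinct ID
d <- expand.grid(Tissue = factor(c("T1", "T2", "T3")), Animal = factor(1:12))
d$Strain <- factor(ifelse(as.integer(d$Animal) <= 6, "S1", "S2"))

# Simulated expression: animal-level random effect plus residual noise
animal_eff <- rnorm(12, sd = 0.5)
d$gene1 <- animal_eff[as.integer(d$Animal)] + rnorm(36)

# Tissue and Strain as fixed effects, Animal as a random intercept
fit <- lmer(gene1 ~ Tissue * Strain + (1 | Animal), data = d)

fixef(fit)     # fixed-effect estimates
VarCorr(fit)   # variance components for Animal and residual
```

With so few animals the estimated animal variance can be imprecise or even
zero, so warnings about a singular fit are possible on data of this size.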

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
