Determining variance components of classed covariates
You might want to try the r-sig-mixed-models list (https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models) next time for mixed model questions.

At any rate, the figures in the Variance column are variances, not percentages. To compare models we can fit them with REML = FALSE (that is, by maximum likelihood) and then compare the fits with anova. Below we find that removing the rs7074431 term makes very little difference (p = 0.078), so we can drop it, but removing the rs11834524 term makes a big difference (p < 2.2e-16). Thus the simpler model modx0 can be used.
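library(lme4)

# Fit by maximum likelihood (REML = FALSE) so the fits can be compared with anova()
modxx <- lmer(Expression ~ 1 + (1|rs11834524) + (1|rs7074431), input_new, REML = FALSE)
modx0 <- lmer(Expression ~ 1 + (1|rs11834524), input_new, REML = FALSE)
mod0x <- lmer(Expression ~ 1 + (1|rs7074431), input_new, REML = FALSE)

# Effect of dropping the rs7074431 term
anova(modxx, modx0)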
Data: input_new
Models:
modx0: Expression ~ 1 + (1 | rs11834524)
modxx: Expression ~ 1 + (1 | rs11834524) + (1 | rs7074431)
      Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
modx0  3 111.386 119.460 -52.693
modxx  4 110.288 121.053 -51.144 3.0986      1    0.07836 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(modxx, mod0x)
Data: input_new
Models:
mod0x: Expression ~ 1 + (1 | rs7074431)
modxx: Expression ~ 1 + (1 | rs11834524) + (1 | rs7074431)
      Df     AIC     BIC   logLik  Chisq Chi Df Pr(>Chisq)
mod0x  3 206.652 214.726 -100.326
modxx  4 110.288 121.053  -51.144 98.365      1  < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
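
The Chisq and Pr(>Chisq) columns in these tables can be reproduced from the logLik column: the likelihood-ratio statistic is twice the difference in log-likelihood between the two fits, referred to a chi-square distribution with 1 degree of freedom. A small check using the rounded values printed in the first table (the variable names are illustrative, not from the thread):

```r
# Log-likelihoods as printed in the first anova table (rounded to 3 decimals)
ll_reduced <- -52.693   # modx0: rs11834524 only
ll_full    <- -51.144   # modxx: rs11834524 and rs7074431

# Likelihood-ratio statistic and its chi-square p-value on 1 df
chisq <- 2 * (ll_full - ll_reduced)
p     <- pchisq(chisq, df = 1, lower.tail = FALSE)

print(c(Chisq = chisq, p = p))
```

The result agrees with the table up to the rounding of logLik. Because a variance cannot be negative, the null value of zero sits on the boundary of the parameter space, and p-values from this chi-square reference are generally regarded as conservative.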
On Mon, Jan 12, 2009 at 11:36 AM, Stephen Montgomery <sm8 at sanger.ac.uk> wrote:
Hi - I am interested in estimating variance components for the data below with respect to the response variable, Expression, in R. However, the covariates aren't independent, and each also belongs to a class; what I am most interested in is the total variance explained by the covariates in each class. Very naively, I have tried to look at each individual covariate's variance like this:
lm<-lmer(Expression ~ 1 + (1|rs11834524) + (1|rs7074431),
data=input_new)
lm
Linear mixed-effects model fit by REML
Formula: Expression ~ 1 + (1 | rs11834524) + (1 | rs7074431)
Data: input
   AIC   BIC logLik MLdeviance REMLdeviance
 108.4 116.5 -51.22      102.5        102.4
Random effects:
 Groups     Name        Variance Std.Dev.
 rs11834524 (Intercept) 0.485538 0.69681
 rs7074431  (Intercept) 0.013720 0.11713
 Residual               0.128853 0.35896
number of obs: 109, groups: rs11834524, 3; rs7074431, 3
Fixed effects:
            Estimate Std. Error t value
(Intercept)   9.9524     0.4098   24.29
My assumption is that this is telling me that rs11834524 explains
0.485538 of the variance and rs7074431 explains 0.013720 of the variance
in Expression when looked at independently.
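
The Variance column is on the squared scale of Expression, so a share of the total is obtained by dividing each component by the sum of all three. A minimal sketch using the values printed in the output above (the names vc and share are introduced here for illustration):

```r
# Variance components copied from the "Random effects" table
vc <- c(rs11834524 = 0.485538, rs7074431 = 0.013720, Residual = 0.128853)

# Each component as a fraction of the total variance
share <- vc / sum(vc)

# Percentages, rounded to one decimal place
print(round(100 * share, 1))
```

Under this model that is roughly 77% for rs11834524, 2% for rs7074431 and 21% residual. Each SNP has only three genotype levels, so each variance is estimated from just three group effects and these shares should be read as rough.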
However, I would like to know how to write a model that tells me how
much of the total variance in Expression is explained by covariates
rs11834524, rs1682421, rs13383869 and rs9457141 (call it class A) and
how much by covariates rs9459617, rs7074431, rs12450785 and rs592724
(call it class B), assuming an additive model within each class. The
caveats are that there is missing data and, again, that the covariates
may be correlated with one another.
A hypothetical result might be:
Class A: Explains 60% of the total variance in expression (response)
Class B: Explains 10% of the total variance in expression
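
One possible way to get class-level figures of this kind, offered only as a sketch and not as the method proposed in this thread, is to treat the genotypes as fixed-effect factors in an ordinary linear model and compare R-squared across nested fits. Everything below is an assumption made for illustration: the data are simulated, and the names toy, a1, a2, b1, b2 and r2 are hypothetical stand-ins for the class A and class B SNPs. Rows with missing genotypes are dropped first so that every fit uses the same observations.

```r
set.seed(1)
n <- 109

# Simulated three-level genotype factor (stand-in for a SNP column)
geno <- function() factor(sample(c("AA", "AG", "GG"), n, replace = TRUE))
toy <- data.frame(a1 = geno(), a2 = geno(), b1 = geno(), b2 = geno())

# Simulated response: strong effect from a1 (class A), weak from b1 (class B)
toy$Expression <- 10 + 0.7 * as.integer(toy$a1) +
  0.2 * as.integer(toy$b1) + rnorm(n, sd = 0.4)

# Introduce some missing genotypes, then keep complete cases only
toy$a2[c(5, 40)] <- NA
cc <- na.omit(toy)

# R-squared of an additive fixed-effects fit on the complete cases
r2 <- function(f) summary(lm(f, data = cc))$r.squared

r2_full <- r2(Expression ~ a1 + a2 + b1 + b2)
r2_A    <- r2(Expression ~ a1 + a2)   # class A alone
r2_B    <- r2(Expression ~ b1 + b2)   # class B alone

# Share attributable to each class beyond what the other class explains
unique_A <- r2_full - r2_B
unique_B <- r2_full - r2_A

print(round(c(full = r2_full, A = r2_A, B = r2_B,
              unique_A = unique_A, unique_B = unique_B), 3))
```

When the covariates are correlated, the "alone" and "unique" figures for a class differ, and the unique shares need not add up to the full R-squared; the gap is variance that cannot be assigned to either class on its own.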
Thanks for the help! I am sorry I am R challenged here...I really
appreciate the guidance!
Stephen
dump("input_new", file=stdout())
input_new <-
structure(list(Individual = structure(1:109, .Label = c("NA06984",
"NA06985", "NA06986", "NA06989", "NA06993", "NA06994", "NA07000",
"NA07022", "NA07037", "NA07045", "NA07051", "NA07055", "NA07056",
"NA07345", "NA07346", "NA07347", "NA07357", "NA07435", "NA11829",
"NA11830", "NA11831", "NA11832", "NA11839", "NA11840", "NA11843",
"NA11881", "NA11882", "NA11892", "NA11893", "NA11894", "NA11917",
"NA11918", "NA11919", "NA11920", "NA11930", "NA11931", "NA11992",
"NA11993", "NA11994", "NA11995", "NA12003", "NA12005", "NA12006",
"NA12043", "NA12044", "NA12056", "NA12057", "NA12144", "NA12145",
"NA12146", "NA12154", "NA12155", "NA12156", "NA12234", "NA12239",
"NA12248", "NA12249", "NA12264", "NA12272", "NA12273", "NA12274",
"NA12275", "NA12282", "NA12283", "NA12286", "NA12287", "NA12340",
"NA12341", "NA12342", "NA12343", "NA12347", "NA12348", "NA12383",
"NA12399", "NA12400", "NA12414", "NA12489", "NA12546", "NA12716",
"NA12718", "NA12748", "NA12749", "NA12750", "NA12751", "NA12760",
"NA12761", "NA12762", "NA12763", "NA12775", "NA12776", "NA12777",
"NA12778", "NA12812", "NA12813", "NA12814", "NA12815", "NA12827",
"NA12828", "NA12829", "NA12830", "NA12842", "NA12843", "NA12872",
"NA12873", "NA12874", "NA12875", "NA12889", "NA12891", "NA12892"
), class = "factor"), Expression = c(9.46026823453575, 10.0788903323991,
9.20330296497174, 10.038741467793, 9.33092349416463, 11.0273957217919,
10.5498875891745, 9.81137299592747, 11.2023261987976, 9.90559354069027,
10.1524696609679, 10.3171767665993, 9.02155519577685, 9.84917871051438,
10.658877473136, 9.88895551011107, 8.62335008726357, 9.21529114100886,
10.7896248923916, 10.1302992505869, 8.64584282787018, 9.56057795866654,
9.89810004078774, 10.2557482141576, 8.95588077688637, 9.56452454115857,
9.26525135092154, 10.5438780642797, 9.8468571349548, 10.7416169225352,
10.5623721612979, 10.6565276881443, 9.67758493445612, 9.75385553511462,
8.997797236767, 11.0106882086179, 10.362578597992, 9.2745507212906,
10.7453355016181, 9.75998268015348, 9.45003620116962, 10.055504292376,
10.7072220720564, 10.0934686444392, 10.0472832129727, 10.1185615033486,
10.3340911031131, 9.70618910683157, 10.5953304905529, 10.4246307909547,
9.91463202635336, 10.249081562168, 10.9252022586474, 10.295544143525,
11.4838109797985, 10.5286570234792, 9.78692800868132, 10.0397050809162,
9.27914623343747, 10.37600233389, 9.27341681588134, 9.40195375611303,
10.8979822929135, 9.03922228977389, 10.3911745662505, 10.4345408213054,
9.8548491618724, 10.1897729275437, 10.2881888849609, 8.9656977165014,
9.81595398472166, 10.1856794532084, 9.3763789479684, 10.1712420020647,
10.2964594680427, 10.3515965292101, 8.94492585275159, 11.2529257614993,
9.25146912450726, 10.1904309237525, 10.7490591053023, 10.3883924463568,
10.097023765247, 10.0824730785217, 10.0828512817661, 10.6371064852226,
10.5831044752098, 10.4484786486601, 8.50264408341596, 10.3468670812262,
9.46061433005316, 8.90027436167269, 9.73630671555279, 9.40555522408144,
10.3220768104446, 8.55132985773453, 10.1678182524815, 10.6145417864386,
10.4169948161073, 10.0253039670548, 10.2568017077865, 10.5045847076951,
9.75993936712448, 8.99997092895909, 10.6742222414794, 10.8640943324257,
10.4295384371541, 10.1987862649656, 10.6744617172313), rs11834524 =
structure(c(1L,
2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 2L, 1L,
1L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 3L, 3L, 3L, 2L,
1L, 1L, 3L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
2L, 2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 1L, 2L, 1L, 1L, 3L, 1L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 3L, 1L, 2L, 3L,
2L, 3L, 2L, 1L, 3L, 3L, 3L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 3L, 3L), .Label = c("AA",
"AG", "GG"), class = "factor"), rs1682421 = structure(c(1L, 2L,
1L, 2L, 2L, 3L, 2L, 2L, 3L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L,
2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 3L, 2L, 3L, 1L, 1L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 2L, 2L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L,
3L, 1L, 1L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L,
1L, 2L, 2L, 1L, 1L, NA, 3L, 2L, 3L, 2L, 2L), .Label = c("CC",
"CT", "TT"), class = "factor"), rs13383869 = structure(c(2L,
2L, 2L, 2L, 2L, NA, 2L, 2L, 1L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 3L,
2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 2L,
2L, 3L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 1L,
1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 3L, 2L, NA, 2L, 2L, 3L, 2L,
2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 3L, 2L,
2L, 1L, 1L, 2L, 1L, 1L, 1L, 3L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L,
1L, 1L, 2L, 2L, NA, 2L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("AA",
"AG", "GG"), class = "factor"), rs9457141 = structure(c(1L, 2L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 1L,
3L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
2L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 1L,
2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, NA, 2L, 1L, 2L, NA, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CC",
"CT", "TT"), class = "factor"), rs9459617 = structure(c(1L, 2L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 1L,
3L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
2L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, NA, 1L, 3L, 3L, 2L, 1L,
2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 2L, 1L, 2L, 2L, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CC",
"CT", "TT"), class = "factor"), rs7074431 = structure(c(2L, 3L,
2L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 3L, 2L,
2L, 2L, 2L, 3L, 2L, 3L, 2L, 2L, 1L, 1L, 3L, 2L, 1L, 2L, 3L, 2L,
1L, 2L, 1L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 1L, 1L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 1L, 2L, 2L, 1L,
1L, 1L, 3L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L), .Label = c("CC",
"CT", "TT"), class = "factor"), rs12450785 = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 1L, 3L, 1L, 3L, 3L, 2L, 2L, 1L, 2L, 3L, 2L,
3L, 1L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 3L, 2L, 2L,
2L, 2L, 2L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 2L,
1L, 2L, 2L, 1L, 1L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 1L, 3L, 2L,
2L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 2L, 3L, 1L, 3L, 2L, 2L,
1L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 3L, 2L, 2L, 2L, 3L, 2L, 3L, 1L,
2L, 2L, 3L, 2L, 2L, 1L, 3L, 3L, 3L, 2L, 3L, 2L), .Label = c("AA",
"AG", "GG"), class = "factor"), rs592724 = structure(c(1L, 2L,
1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 2L,
2L, 2L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 3L,
1L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
3L, 1L, 3L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 3L, 1L, 3L, 3L,
2L, 2L, 1L, 1L, 3L, 2L, 2L, 2L, 1L, 3L, 2L, 3L, 1L, 3L, 3L, 2L,
2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
1L, 1L, 2L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L), .Label = c("CC",
"CT", "TT"), class = "factor"), Grp = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1", class =
"factor")), .Names = c("Individual",
"Expression", "rs11834524", "rs1682421", "rs13383869", "rs9457141",
"rs9459617", "rs7074431", "rs12450785", "rs592724", "Grp"), row.names =
c(NA,
-109L), class = "data.frame")
Stephen B. Montgomery
Postdoctoral Researcher, Population and Comparative Genomics
Wellcome Trust Sanger Institute
Hinxton, Cambridge CB10 1SA
Skype: stephen.b.montgomery