
Coefficient of determination (R^2) when using lme()

7 messages · R.S. Cotter, Martin Henry H. Stevens, vito muggeo +4 more

#
Dear mixed models users,

I have recently started using R, and I have learned to use lme().

Is it possible to interpret the coefficient of determination (R^2) when
using lme()?


Best Regards

R.S. Cotter
#
Hi R.S.,
This quantity is not clearly defined for mixed models: should it
include the variation "explained" by the random effects? What would
it mean to "explain" a response with a variance? In any event, try
searching the R-help lists for "coefficient of determination" AND lme.
Cheers,
Hank
On Apr 1, 2008, at 6:17 AM, R.S. Cotter wrote:
Dr. Hank Stevens, Assistant Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/~stevenmh/
http://www.cas.muohio.edu/ecology
http://www.muohio.edu/botany/

"If the stars should appear one night in a thousand years, how would men
believe and adore." -Ralph Waldo Emerson, writer and philosopher  
(1803-1882)
#
Dear R.S. Cotter,
I think that the interpretation of R2 is not straightforward and it is
an area of active research. Have a look at:

Xu, R. (2003). Measuring explained variation in linear mixed effects
models. Statistics in Medicine, 22:3527-3541. DOI: 10.1002/sim.1572

Orelien, J.G., Edwards, L.J. (2007). Fixed-effect variable selection in
linear mixed models using R2 statistics. Computational Statistics &
Data Analysis. DOI: 10.1016/j.csda.2007.06.006

Hope this helps you,

vito



R.S. Cotter wrote:
#
This came up with a reviewer when I was using glms as well. I've
become fond of using the R^2 of the correlation between the fitted and
observed values. It's easily interpretable by a general audience.

r2.corr.lmer <- function(lmer.object) {
    summary(lm(attr(lmer.object, "y") ~ fitted(lmer.object)))$r.squared
}
On Apr 1, 2008, at 3:37 AM, MHH Stevens wrote:

            
#
That's going to break in the next version of R (due out later this month).

Use

 lmer.object@y

not

 attr(lmer.object, "y")

Slots in S4 classed objects were initially implemented as attributes
but they are not attributes.

In general, if you want to determine the structure of an object, use
the str() function. Better still, use the appropriate extractor
functions: the value of an extractor function should stay consistent
across versions of the package, even when the internal structure of
the object changes between versions. The appropriate extractor in this
case is

model.response(lmer.object)
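
Putting the thread's two suggestions together, a version of the
fitted-vs-observed R^2 that relies only on extractor functions (rather
than reaching into the object's slots) might look like this. This is a
sketch, not code from the thread; `r2.corr` is a hypothetical name, and
`fm` is assumed to be a fitted model for which `model.frame()` and
`fitted()` methods exist:

```r
## Squared correlation between observed and fitted values, using only
## extractor functions instead of the object's internal structure.
r2.corr <- function(fm) {
  y <- model.response(model.frame(fm))  # observed response
  cor(y, fitted(fm))^2                  # equals the R^2 of lm(y ~ fitted(fm))
}
```

For a plain lm() fit this reproduces the usual R^2 exactly, which is
what makes it easy to explain to a general audience.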
On Tue, Apr 1, 2008 at 9:40 AM, Jarrett Byrnes <jebyrnes at ucdavis.edu> wrote:
#
My $0.02.

Gelman also has an excellent article, but he uses Bayes to estimate explained variance, so it may not be as straightforward as other methods.

Gelman, A., Pardoe, I. (2006). Bayesian measures of explained variance and pooling in multilevel (hierarchical) models. Technometrics, 48(2), 241-251.

I personally am not a fan of simply correlating the fitted values with the raw scores. The problem, as I see it, is that you ran the multilevel model because you wanted to honor the nesting structure (for any number of reasons). I see doing this almost like when people run ANOVAs as a post hoc for a MANOVA. If your analysis is multilevel, then produce a statistic for understanding explained variance that is also multilevel. By the way, I have come full circle on this. I used to think that we needed a single metric to tell us about explained variance in a model (see http://www.hlm-online.com/papers/). Now, I'm not so sure.

One other problem is that, unlike the OLS counterpart, in multilevel analysis you can actually ADD variance to your model through the addition of covariates/predictors. This is often a sign of model misspecification, but it can also occur when the model is correctly specified (and no, group mean centering won't always fix this problem). If you do a search on the multilevel listserv, you can see this discussed at length in multiple threads. You can also see a discussion of this in Snijders & Bosker (1999, pp. 99-109).

Hope this helps,
Kyle

********************************************************
Dr. J. Kyle Roberts
Department of Literacy, Language and Learning
School of Education and Human Development
Southern Methodist University
P.O. Box 750381
Dallas, TX  75275
214-768-4494
http://www.hlm-online.com/
********************************************************

#
The question should be: "What is one trying to estimate?"
Or "What is one trying to measure?"  Until that is settled,
no amount of research will go anywhere useful.  Once it
is settled, an answer may be quickly forthcoming.

R^2 ought not to be treated as a quantity whose magic is independent
of meaningfulness. Often it has no meaningfulness that is relevant to
the intended use of the regression results. If used at all, adjusted
R^2 is preferable to R^2.

R^2 is a design measure, estimating how effectively
the data are designed to extract a regression signal.
Change the design (e.g., in a linear regression by
doubling the range of values of the explanatory variable),
and one changes (in this case, very substantially
increases) the expected value of R^2.
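
For instance (a base-R sketch, not from the thread; the slope, noise
level, and sample size are arbitrary choices for illustration): with
the same slope and the same error variance, simply widening the range
of x raises the expected R^2.

```r
## R^2 responds to the 'design': same slope, same noise, wider x-range.
set.seed(42)
n <- 200
r2_for_range <- function(xmax) {
  x <- runif(n, 0, xmax)
  y <- 2 * x + rnorm(n, sd = 5)
  summary(lm(y ~ x))$r.squared
}
r2_narrow <- r2_for_range(1)   # x in [0, 1]
r2_wide   <- r2_for_range(10)  # x in [0, 10]: much larger R^2
```

Nothing about the relationship between y and x has changed; only the
design has.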

It can also be used as a rather crude way to compare two
models for the one set of data, i.e., with the same 'design'.
But be careful, replacing y by log(y) can increase R^2
and give a model that fits less well, or vice versa.
Consider why that might be!
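
One way to see the difficulty (a base-R sketch, not from the thread,
assuming a multiplicative-error data-generating process): the two R^2
values are computed on different scales of the response and so are not
directly comparable.

```r
## R^2 for y ~ x and for log(y) ~ x measure fit on different scales.
set.seed(1)
x <- runif(200, 1, 10)
y <- exp(0.3 * x + rnorm(200, sd = 0.4))  # error multiplicative in y
r2_raw <- summary(lm(y ~ x))$r.squared
r2_log <- summary(lm(log(y) ~ x))$r.squared
## The two numbers differ, yet neither says which model predicts y
## itself better; that comparison requires back-transforming to the
## original scale.
```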

What aspect of the 'design' that underpins your multilevel
model do you wish to characterize?

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 1 Apr 2008, at 10:54 PM, vito muggeo wrote: