
F test

7 messages · kayj, Jun Shen, Michael Lawrence +1 more

#
summary(my_lm) will give you t-values, anova(my_lm) will give you
(equivalent) F-values. summary() might be preferred because it also
provides the estimates & SE.
Call:
lm(formula = dv ~ iv1 * iv2, data = a)

Residuals:
    Min      1Q  Median      3Q     Max
-1.8484 -0.2059  0.1627  0.4623  1.0401

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.4864     0.4007  -1.214    0.270
iv1           0.8233     0.5538   1.487    0.188
iv2           0.2314     0.3863   0.599    0.571
iv1:iv2      -0.4110     0.5713  -0.719    0.499

Residual standard error: 1.017 on 6 degrees of freedom
Multiple R-squared: 0.3161,	Adjusted R-squared: -0.02592
F-statistic: 0.9242 on 3 and 6 DF,  p-value: 0.4842
Analysis of Variance Table

Response: dv
          Df Sum Sq Mean Sq F value Pr(>F)
iv1        1 1.9149  1.9149  1.8530 0.2223
iv2        1 0.4156  0.4156  0.4021 0.5494
iv1:iv2    1 0.5348  0.5348  0.5175 0.4990
Residuals  6 6.2004  1.0334
On Thu, Apr 16, 2009 at 10:35 AM, kayj <kjaja27 at yahoo.com> wrote:
#
I'm new to LME myself, so it would be best for others to advise on this.
On Thu, Apr 16, 2009 at 3:00 PM, Jun Shen <jun.shen.ut at gmail.com> wrote:
#
On Thursday 16 April 2009 at 14:08 -0300, Mike Lawrence wrote:
Ahem. "Equivalent", my tired foot...

In simple terms (the "real" real story may be more intricate....) :

The "F values" given by anova are something entirely different from the t
values in summary. The latter let you assess a property of *one*
coefficient in your model (namely: do I have enough support to state that
it is nonzero?). The former let you assess whether you have
support for stating that *ALL* the coefficients related to the same
factor cannot be *SIMULTANEOUSLY* null. Which is a horse of quite
another color...

By the way: while your "summary" does indeed give you an
unbiased estimate of each coefficient and a (hopefully) unbiased
estimate of its standard error, the "F" is a ratio of estimates
of the "remaining" variability with and without the H0 assumption it
tests, namely that *ALL* coefficients of your factor of interest are
*SIMULTANEOUSLY* null.
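To make that concrete, here is a minimal sketch (with made-up simulated data) of the F test as a comparison of nested models, where H0 drops *all* coefficients of a factor at once:

```r
# Sketch: the F test compares nested models, asking whether *all*
# coefficients of a factor are simultaneously zero (simulated data)
set.seed(3)
d <- data.frame(g = factor(rep(letters[1:3], each = 10)),
                y = rnorm(30))
full <- lm(y ~ g, data = d)  # two coefficients encode the 3-level factor
null <- lm(y ~ 1, data = d)  # H0: both coefficients are zero
anova(null, full)            # a single F test on 2 and 27 df
```

Note that summary(full) would instead give two separate t tests, one per coefficient, neither of which answers the joint question by itself.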

F and t "numbers" will be "equivalent" if and only if your "factor of
interest" needs only one coefficient to be expressed, i.e. is a
continuous covariate or a two-level discrete variable (such as a
boolean). In this case, you can test your factor either by the t value,
which, under H0, is distributed as a Student's t with n_res dof (n_res being
the "residual degrees of freedom" of the model), or by the F value, which
is distributed as a Fisher F with 1 and n_res dof, and which
happens (but that's not happenstance...) to be the *square* of a t with
n_res dof.
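A quick sketch of that single-coefficient case (made-up simulated data): with one predictor, the anova() F value is exactly the square of the summary() t value.

```r
# Sketch: for a single-coefficient predictor, F = t^2 (simulated data)
set.seed(1)
x <- rnorm(20)
y <- 2 * x + rnorm(20)
fit <- lm(y ~ x)
t_val <- summary(fit)$coefficients["x", "t value"]
f_val <- anova(fit)["x", "F value"]
all.equal(f_val, t_val^2)  # TRUE
```

With more than one coefficient per factor (or with the sequential sums of squares that anova() reports for earlier terms), this identity no longer holds.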

May I suggest consulting a textbook *before* flunking ANOVA 101?

					Emmanuel Charpentier
#
My bad, I wasn't paying attention.
Harsh but warranted given my carelessness.


On Thu, Apr 16, 2009 at 3:47 PM, Emmanuel Charpentier
<charpent at bacbuc.dyndns.org> wrote:
#
On Thursday 16 April 2009 at 13:00 -0500, Jun Shen wrote:
With lme, you have to specify a *list* of random effects as the
"random=" argument, which, off the top of my (somewhat tired) limbic
system,  is specified by an "lmList", iirc. But I'm pretty sure that
"?lme" will give you much better information...

lmer is simpler. Say your basic model is "Z~Y", with another variable X
acting as a possible source of random variation. You express random
effects by adding a (1|X) term (meaning X is a simple (intercept) random
effect) to the model, or (Y|X), which specifies both a random intercept
effect of X and a random effect on the slope of the (supposed) line
relating Z to Y. If you want to specify a possible variation
of slopes with no intercept effect, say "Z~Y+(0+Y|X)" (unless
I'm mistaken: I never used that, because I never had to work on such a
model in a context where it would make sense...).
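The three random-effects specifications above can be sketched as follows (a minimal example assuming the lme4 package is installed; the data frame and variable names Z, Y, X are made up for illustration):

```r
library(lme4)  # assumed available; provides lmer()

# Made-up grouped data: X is the grouping factor, Y the covariate
set.seed(2)
d <- data.frame(X = factor(rep(1:10, each = 5)), Y = rnorm(50))
d$Z <- 1 + 2 * d$Y + rnorm(10)[d$X] + rnorm(50)

m1 <- lmer(Z ~ Y + (1 | X), data = d)      # random intercept only
m2 <- lmer(Z ~ Y + (Y | X), data = d)      # random intercept and slope
m3 <- lmer(Z ~ Y + (0 + Y | X), data = d)  # random slope, no random intercept
```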
Ah. That's the (multi-)million-dollar question. You're dealing with a
fixed*random interaction, which has a totally different meaning from a
fixed*fixed interaction. The test of the latter asks "does the
(possible) effect of the first factor vary with the levels of the second
factor?", whereas the former asks "does the random factor increase
variability beyond what I know of the effect of the fixed factor?", with
totally different consequences.

Pinheiro and Bates' book (2000), which, among other things, describes the
care and feeding of lme, gives explanations about 1) why the
problem is computationally hard, 2) how to get an approximate answer, and
3) in which respects the previous advice might be misleading. This book
is, IMHO, required reading for anybody with more than a passing interest
in the subject, and I won't paraphrase it...

Since then, Pr Bates has started to develop lme4, which has different inner
algorithms. In the course of this work, he started to have doubts about
the "conventional wisdom" on testing effects in mixed models, and has
not (yet) provided these "conventional" means of testing. Instead, he
seems to be working along the lines of MCMC sampling to assess various aspects
of mixed models. But there I suggest walking through the R-SIG-mixed-models
archives, which are *very* interesting.

Of course, if you want an authoritative answer (i.e. an answer that
will pass a medical journal's reviewers unquestioned), you can always use
SAS's PROC MIXED. But I wouldn't swear that this answer is "exact", or even
sensible, as far as I can judge...

Pr Bates seems to answer readily any (sensible) question on the ME
mailing list, where you will also find folks much better qualified than
yours truly to answer this and that question...

HTH,

					Emmanuel Charpentier