Skip to content

Drop1 and weights

5 messages · Yan Wong, Brian Ripley

#
Hi,

If I used drop1 in a weighted lm fit, it seems to ignore the weights  
in the AIC calculation of the dropped terms, see the example below.  
Can this be right?

Yan

--------------------

library(car)
 > unweighted.model <- lm(trSex ~ (river+length +depth)^2- 
length:depth, dno2)
 > Anova(unweighted.model)
Anova Table (Type II tests)

Response: trSex
              Sum Sq  Df F value  Pr(>F)
river        0.1544   1  3.3151 0.07100 .
length       0.1087   1  2.3334 0.12911
depth        0.1917   1  4.1145 0.04461 *
river:length 0.0049   1  0.1049 0.74652
river:depth  0.3008   1  6.4567 0.01226 *
Residuals    5.9168 127
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
 >
 > drop1(unweighted.model)
Single term deletions

Model:
trSex ~ (river + length + depth)^2 - length:depth
              Df Sum of Sq     RSS     AIC
<none>                       5.92 -401.97
river:length  1  0.004889    5.92 -403.86
river:depth   1      0.30    6.22 -397.37

### Both drop1 & Anova suggest dropping river:length
### Compare with the following:

 > weighted.model <- lm(trSex ~ (river+length +depth)^2-length:depth,  
dno2, weights=males+females)
 > Anova(weighted.model)
Anova Table (Type II tests)

Response: trSex
              Sum Sq  Df F value  Pr(>F)
river         1.471   1  3.3775 0.06843 .
length        1.002   1  2.2999 0.13187
depth         2.974   1  6.8295 0.01005 *
river:length  0.075   1  0.1733 0.67790
river:depth   2.020   1  4.6397 0.03313 *
Residuals    55.303 127
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
 >
 > drop1(weighted.model)
Single term deletions

Model:
trSex ~ (river + length + depth)^2 - length:depth
              Df Sum of Sq     RSS     AIC
<none>                      55.30 -104.71
river:length  1    -49.33    5.97 -402.79
river:depth   1    -49.04    6.26 -396.39
#
It looks like at some point weighted.residuals() was changed, and broke 
this.  The models are fitted correctly, but AIC is wrong.  However, note 
that it does work correctly for a glm() fit which gives you a workaround.

You example is not reproducible, so I was unable to test the workaround 
nor the correction which will short;y be in R-patched.
On Wed, 1 Mar 2006, Yan Wong wrote:

            

  
    
#
On 2 Mar 2006, at 13:25, Prof Brian Ripley wrote:

            
Thank you.
Indeed, I realised this, but assumed that the problem could be  
understood even without my dataset. I'll test my data on the patched  
version when it becomes available, and re-post then.

Thanks again

Yan
#
On Thu, 2 Mar 2006, Yan Wong wrote:

            
Now I have tried a similar example, I think that the workaround is correct 
but potentially confusing due to another error elsewhere.  Using drop1 on 
a gaussian glm fit with weights gives the correct deviance and the correct 
AICs up to an additive constant.  However, AIC() on such a fit gives the 
wrong AIC, since gaussian()$aic is incorrect (or at least it treats the 
weights as case weights, which is not the interpretation everything else 
uses). (I've changed that in R-devel.)

So in 2.2.1 patched
Single term deletions

Model:
time ~ dist + climb
        Df Sum of Sq    RSS    AIC
<none>              437.64  94.41
dist    1    164.05 601.68 103.55
climb   1      8.66 446.29  93.10
Single term deletions

Model:
time ~ dist + climb
        Df Deviance    AIC
<none>      437.64  21.30
dist    1   601.68  30.44
climb   1   446.29  19.98

and in R-devel
Single term deletions

Model:
time ~ dist + climb
        Df Deviance    AIC
<none>      437.64 323.87
dist    1   601.68 333.01
climb   1   446.29 322.55

(These differ only in which additive constants are included, with that in
drop1.glm in 2.2.1 patched being based on an incorrect calculation, 
which for this purpose does not matter.)
5 days later
#
On 2 Mar 2006, at 17:56, Prof Brian Ripley wrote:

            
I've just tried R-patched, and it now works as expected. Thanks for  
the quick fix. As you point out, the glm and lm versions differ in  
AIC reported, but only by an additive constant, and so the tests are  
not affected.

Yan

--------------------

 > weighted.lm <- lm(trSex ~ (river+length +depth)^2-length:depth,  
dno2, weights=males+females)
 > d1.lm <- drop1(weighted.lm, test="F");
 > weighted.glm <- glm(trSex ~ (river+length +depth)^2-length:depth,  
dno2, weights=males+females, family="gaussian")
 > d1.glm <- drop1(weighted.glm, test="F");
 >
 > #differ by a constant
 > d1.lm$AIC - d1.glm$AIC
[1] 628.1381 628.1381 628.1381
 >
 > Anova(weighted.lm)
Anova Table (Type II tests)

Response: trSex
              Sum Sq  Df F value  Pr(>F)
river         1.471   1  3.3775 0.06843 .
length        1.002   1  2.2999 0.13187
depth         2.974   1  6.8295 0.01005 *
river:length  0.075   1  0.1733 0.67790
river:depth   2.020   1  4.6397 0.03313 *
Residuals    55.303 127
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
 > d1.lm
Single term deletions

Model:
trSex ~ (river + length + depth)^2 - length:depth
              Df Sum of Sq      RSS      AIC F value   Pr(F)
<none>                      55.303 -104.710
river:length  1     0.075   55.379 -106.529  0.1733 0.67790
river:depth   1     2.020   57.324 -101.938  4.6397 0.03313 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
 > d1.glm
Single term deletions

Model:
trSex ~ (river + length + depth)^2 - length:depth
              Df Deviance     AIC F value   Pr(F)
<none>             55.30 -732.85
river:length  1    55.38 -734.67  0.1733 0.67790
river:depth   1    57.32 -730.08  4.6397 0.03313 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1