testing non-linear component in mgcv:gam

Hi,

I need further help with my GAMs. Most models I test are very  
obviously non-linear. Yet, to be on the safe side, I report the  
significance of the smooth (default output of mgcv's summary.gam) and  
confirm it deviates significantly from linearity.

I do the latter by fitting a second model in which the same predictor is
entered without the s(), and then use anova.gam to compare the two. I
thought this was equivalent to the default output of anova.gam in
package gam (as opposed to mgcv).

I wonder whether this procedure is correct, because one of my models
appears to be linear. In fact mgcv estimates the df of the smooth to be
exactly 1.0, so I could have stopped there. However, I inadvertently
repeated the procedure outlined above. I would have thought that in this
case the anova.gam comparing the smooth and the linear fit would
certainly have been non-significant. To my surprise, P was 6.18e-09!

Am I doing something wrong when I attempt to confirm that the
non-parametric part of a smoother is significant? Here is my example
case, where the relationship does appear to be linear:

library(mgcv)
Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62,
          0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12,
          3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62,
          5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62)
N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31,
            22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10,
            2, 5, 3, 1, 1, 1, 1, 1)
wm.sed <- c(0.000000000, 0.016129032, 0.000000000, 0.062046512,
            0.396459596, 0.189082949, 0.054757925, 0.142810440,
            0.168005168, 0.180804428, 0.111439628, 0.128799505,
            0.193707937, 0.105921610, 0.103497845, 0.028591837,
            0.217894389, 0.020535469, 0.080389068, 0.105234450,
            0.070213450, 0.050771363, 0.042074434, 0.102348837,
            0.049748344, 0.019100478, 0.005203125, 0.101711864,
            0.000000000, 0.000000000, 0.014808824, 0.000000000,
            0.222000000, 0.167000000, 0.000000000, 0.000000000,
            0.000000000)

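# smooth fit, weighted by the number of sets in each temperature bin
sed.gam <- gam(wm.sed ~ s(Temp), weights = N.sets)
summary.gam(sed.gam)
# testing non-linear contribution: same predictor entered without s()
sed.lin <- gam(wm.sed ~ Temp, weights = N.sets)
summary.gam(sed.lin)
anova.gam(sed.lin, sed.gam, test = "F")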
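
For anyone who wants to see what the comparison is working from, here is a minimal sketch that puts the effective df, residual df and deviance of the two fits side by side. Note that compare_fits is a hypothetical helper written for this post, not an mgcv function; it only reads components that a fitted gam object carries (edf) and uses the standard df.residual() and deviance() extractors.

```r
library(mgcv)

# Hypothetical helper (not part of mgcv): tabulate the quantities an
# F comparison of a linear fit and a smooth fit is based on.
compare_fits <- function(lin, smooth) {
  data.frame(
    model    = c("linear", "smooth"),
    # total effective df of each model, intercept included
    edf      = c(sum(lin$edf), sum(smooth$edf)),
    resid.df = c(df.residual(lin), df.residual(smooth)),
    deviance = c(deviance(lin), deviance(smooth))
  )
}
```

Called as compare_fits(sed.lin, sed.gam). If the smooth really has collapsed to a straight line, I would expect the two rows to be nearly identical, which is what makes the small P value puzzling.
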
Thanks in advance,


Denis Chabot