Skip to content

poly() and rcs()?

4 messages · Antoine Tremblay, vito muggeo, Paul Johnson

#
Dear all,

Here is a question regarding the difference between using poly() and rcs().

If you fit a model using poly(), as shown below, the summary returns
statistics for the term "linear" term,
poly(TrialContinuous,2,raw=TRUE)1, and statistics for the quadratic,
poly(TrialContinuous,2,raw=TRUE)2. Here, the summary tells us that the
"linear" and quadratic terms are significant, that the interaction
between the linear term and the factor variable FreqGroup is
significant, but that the interaction between the quadratic and
FreqGroup is not significant.
Linear mixed model fit by REML
Formula: LogRT ~ poly(TrialContinuous, 2, raw = TRUE) * FreqGroup + (1
|      Subject)
   Data: dat.li
   AIC   BIC     logLik deviance REMLdev
 17199 17266  -8592    17056   17183
Random effects:
 Groups   Name        Variance   Std.Dev.
 Subject  (Intercept)   0.014825  0.12176
 Residual                  0.101560  0.31868
Number of obs: 30742, groups: Subject, 25

Fixed effects:

         Estimate    Std. Error    t value
(Intercept)
     6.307e+00  2.456e-02    256.85
poly(TrialContinuous, 2, raw = TRUE)1                       -1.570e-04
  4.639e-06   -33.84
poly(TrialContinuous, 2, raw = TRUE)2                        9.779e-08
  1.120e-08    8.73
FreqGroupLow
3.299e-02   6.085e-03    5.42
poly(TrialContinuous, 2, raw = TRUE)1:FreqGroupLow  1.769e-05
8.736e-06    2.03
poly(TrialContinuous, 2, raw = TRUE)2:FreqGroupLow -1.168e-08
2.123e-08   -0.55

Now, if you use rcs() instead, the summary return statistics for the
first and second splines, as shown below. We see that the 2 splines
are significant, but there is no interaction between any of the
splines and FreqGroup. This reflects what we see in the model fitted
with poly(), where the the linear and quadratic terms were significant
but the quadratic by FreqGroup interaction was not significant. What's
missing in the model fitted with rcs() is the linear term.
Linear mixed model fit by REML
Formula: LogRT ~ rcs(TrialContinuous, 3) * FreqGroup + (1 | Subject)
   Data: dat.li
   AIC   BIC     logLik deviance REMLdev
 17180 17247  -8582    17065   17164
Random effects:
 Groups   Name        Variance   Std.Dev.
 Subject  (Intercept)  0.014826   0.12176
 Residual                 0.101592   0.31874
Number of obs: 30742, groups: Subject, 25

Fixed effects:

         Estimate    Std. Error    t value
(Intercept)
     6.286e+00  2.498e-02    251.62
rcs(TrialContinuous, 3)TrialContinuous
-2.538e-04  1.253e-05   -20.26
rcs(TrialContinuous, 3)TrialContinuous'
1.285e-04   1.547e-05    8.31
FreqGroupLow
3.599e-02   1.061e-02    3.39
rcs(TrialContinuous, 3)TrialContinuous:FreqGroupLow   3.018e-05
2.379e-05    1.27
rcs(TrialContinuous, 3)TrialContinuous':FreqGroupLow -1.660e-05
2.936e-05   -0.57

I tried adding a linear term "TrialContinuous" to the model fitted
with rcs(), as shown below, but that doesn't work, the model is not
positive definite.
The question is where is the linear term (if there is one)?
Is it hidden somewhere? Or is it simply not a question to ask when
using rcs() to fit models?
Should I use poly() rather than rcs()?
Are there situations where you would want to use poly() over rcs() and
vice versa?

Additionally, how should we interpret the statistics for the splines?
Does their significance mean that the slopes in the splines are
significantly different than 0?
In the case of the second spline, does it's significance mean that it
is significantly different than the first spline?

Thank you very much for your time and efforts. Your help is greatly appreciated.
Sincerely

--
Antoine Tremblay
Department of Neuroscience
Georgetown University
Washington DC
#
dear Antoine,
I don't know the rcs() function (from which package..?)

BTW, why do you want add a linear term? I presume it is "included" in 
rcs().. I mean the linear term you want to include is linear combination 
of two bases (colums) returned by rcs(). So if you want include it, you 
should remove one column of rcs()...

Also the parameter estimates you get depend on parameterization of 
poly() and rcs(). Using basis splines it is meaningless to test for some 
spline coefficient to be different from zero.. I imagine, again I don't 
know rcs(), the same holds for rcs().
If you want to simplify your model I suggest to use differences in log 
likelihoods

Hope this helps you,
vito

Antoine Tremblay ha scritto:

  
    
1 day later
#
On Sun, Nov 15, 2009 at 2:20 PM, Antoine Tremblay <trea26 at gmail.com> wrote:
[snip]
Dear Antoine:

I am not an expert, but I *think* you need to more carefully consider
what rcs does. If you are using rcs from the  Design package, that is:
  It is a restricted cubic spline.  It is not similar to poly looking
for a parametric curve across the whole range. Rather, it is dividing
up the range of X into pieces and then fitting a cubic curve to each
one. There is not supposed to be a "linear" term.

A couple of years ago, I decided to try to master some of that
terminology and I wrote up something for my class.  I'm pretty sure I
got the rcs part correct.

http://pj.freefaculty.org/stat/Splines/nonparametricModels.pdf

Look down to p. 17, where (I see now) I was curious about where the
linear term went, just like you are now.

Good luck

pj

  
    
#
Dear Paul,
Thank you very much for the document, it is of great help :-)

I guess what I mean by the linear term is the b1xi in the equation yi
= ?b0 + ?b1 xi + ?b2 (xi ? ?1 )3 + ?b3 (xi ? ?2 )3  + + . . .

The summary output for lmer with rcs() only seems to give statistics
for the b(xi ? ?)3 s, but not for the b1xi term.

Thanks again for that document, it's great!

Antoine
On Tue, Nov 17, 2009 at 11:19 PM, Paul Johnson <pauljohn32 at gmail.com> wrote: