Skip to content

Specify model with polynomial interaction terms up to degree n

15 messages · Bert Gunter, Rui Barradas, David Winsemius +3 more

YTP
#
I would like to specify a model with all polynomial interaction terms between
two variables, say, up to degree 6. For example, terms like a^6 + (a^5 *
b^1)  +  (a^4 * b^2) + ... and so on.  The documentation states

The ^ operator indicates crossing to the specified degree.

so I would expect a model specified as y ~ (a+b)^6 to produce these terms.
However doing this only returns four slope coefficients, for Intercept, a,
b, and a:b.  Does anyone know how to produce the desired result? Thanks in
advance.

--
View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130.html
Sent from the R help mailing list archive at Nabble.com.
#
On Jul 2, 2012, at 9:29 AM, YTP wrote:

            
You might try:

poly(a,6)*poly(b,6)

(untested   ... and it looks somewhat dangerous to me.)
#
On Jul 2, 2012, at 10:51 AM, David Winsemius wrote:

            
Well, now it's tested and succeeds at least numerically. Also tested

( poly(a,6) +poly(b,6) )^2 with identical results.

Whether this is wise practice remains in doubt:

dfrm <- data.frame(out=rnorm(100), a=rnorm(100), b=rnorm(100) )
anova(lm( out ~ (poly(a,6) +poly(b,6) )^2, data=dfrm) )
#-----------------------
Analysis of Variance Table

Response: out
                       Df Sum Sq Mean Sq F value  Pr(>F)
poly(a, 6)             6 12.409 2.06810  3.0754 0.01202 *
poly(b, 6)             6  5.321 0.88675  1.3187 0.26596
poly(a, 6):poly(b, 6) 36 41.091 1.14142  1.6974 0.04069 *
Residuals             51 34.295 0.67246
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
#
Hello,

Another way is to cbind the vectors 'a' and 'b', but this needs argument 
'raw' set to TRUE.

poly(cbind(a, b), 6, raw=TRUE)

To the OP: is this time series related? With 6 being a lag or test 
(e.g., Tsay, 1986) order? I'm asking this because package nlts has a 
function for this test up to order 5 and it uses poly().

Hope this helps,

Rui Barradas

Em 02-07-2012 16:04, David Winsemius escreveu:
YTP
#
I am not sure that method works. It appears to be doing something close, but
returns too many slope coefficients, since I think it is returning
interaction terms of degree smaller and greater than what was passed to it.

 Here is a small example of degree 2:
X1 X2
1  1  4
2  2  5
3  3  6
# We see slope coefficients returned for interaction terms of degree 1, 3,
and 4.
Estimate Std. Error z value Pr(>|z|)
(Intercept)                   -7.855e+00  4.588e+04       0        1
poly(X$X1, 2)1                 6.059e-15  7.946e+04       0        1
poly(X$X1, 2)2                -3.848e+01  7.946e+04       0        1
poly(X$X2, 2)1                        NA         NA      NA       NA
poly(X$X2, 2)2                        NA         NA      NA       NA
poly(X$X1, 2)1:poly(X$X2, 2)1         NA         NA      NA       NA
poly(X$X1, 2)2:poly(X$X2, 2)1         NA         NA      NA       NA
poly(X$X1, 2)1:poly(X$X2, 2)2         NA         NA      NA       NA
poly(X$X1, 2)2:poly(X$X2, 2)2         NA         NA      NA       NA





# Data used in the model
y poly(X$X1, 2).1 poly(X$X1, 2).2 poly(X$X2, 2).1 poly(X$X2, 2).2
1 0   -7.071068e-01    4.082483e-01   -7.071068e-01    4.082483e-01
2 1    4.350720e-18   -8.164966e-01    4.350720e-18   -8.164966e-01
3 0    7.071068e-01    4.082483e-01    7.071068e-01    4.082483e-01


where instead I would expect the data to be like

1  4   16
4  10  25
9  18  36


Any other ideas?  Many thanks.



--
View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635196.html
Sent from the R help mailing list archive at Nabble.com.
#
On Jul 2, 2012, at 4:13 PM, YTP wrote:

            
No. It _was_ doing what you asked for. I think you don't understand  
the expansion.
Degree 3 and 4 ???  That wasn't how I would have described the results  
(there being only first and second order terms as determined by the  
'degree' argument), but the fact remains.... You cannot estimate more  
than three parameters with a dataset of only 3 points. That is basic  
math.

It seems you have demonstrated that these weapons are "too sharp" to  
be wielded safely in your hands, so maybe you should step back a few  
paces and re-consider what you are really trying to accomplish. Is the  
goal really curve fitting with with N^2 polynomial parameters. And do  
you _really_ want to be describing to your audience the interpretation  
of models created with a logit link that have high degree polynomial  
terms? Or are you instead interested in flexible regression for  
purposes of description or exploratory analyses such as provided by  
loess (one form of local regression) or perhaps GAM's with smoothing  
terms?

test = data.frame(y = rnorm(1000),X1=rnorm(1000), X2=rnorm(1000) )
plot.new()
persp( x= seq(-2, 2,by=0.1),  y= seq(-2, 2,by=0.1),
        z= predict(mymodel2,
           newdata =expand.grid(X1=seq(-2, 2,by=0.1), X2=seq(-2,  
2,by=0.1)) ) ,
        ticktype="detailed")

This has the advantage that the polynomial basis is hidden and you  
only end up looking at the predictions. I also like working with Frank  
Harrell's 'rms' package because his perimeter function restricts  
plotted estimates to regions with sufficient data and the restricted  
cubic splines have less tendency to blow-up near the boundaries of hte  
data.
Orthogonal basis.
?poly

You can get something like that with poly(X1, degree=2, raw=TRUE)

 > poly(1:3, degree=2, raw=TRUE)
      1 2
[1,] 1 1
[2,] 2 4
[3,] 3 9
attr(,"degree")
[1] 1 2
attr(,"class")
[1] "poly"   "matrix"
YTP
#
I think you have taken my toy example seriously. Perhaps I wasn't clear, but
I am in fact not working with a dataset of 3 observations of the numbers 1
through 6 and trying to estimate anything; that was an example to illustrate
what I am asking for, namely, turning two variables like this

 X1 X2
  1  4
  2  5
  3  6 


into a dataset like this

 1  4   16
 4  10  25
 9  18  36 


where each column is the interaction between certain polynomial terms of the
original variables, such that each column has a sum of exponents equal to 2
(or whatever degree n is desired). My apologies if I wasn't clear, any other
ideas would be appreciated. It appears the poly command you mentioned is
only taking powers of a single vector.



--
View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635217.html
Sent from the R help mailing list archive at Nabble.com.
#
Hello,

Sorry, but it was you that misread some of the suggestions. I have 
written raw=TRUE not raw=raw. Just see


m <- matrix(1:6, ncol=2)  # your example

p2 <- poly(m, degree=2, raw=TRUE) # it's raw=TRUE, not raw=raw !!!
deg2 <- attr(p2, 'degree') == 2
p2[, deg2]

p6 <- poly(m, degree=6, raw=TRUE) # now degree 6
deg6 <- attr(p6, 'degree') == 6
p6[, deg6]


Hope this helps,

Rui Barradas

Em 02-07-2012 23:52, YTP escreveu:
4 days later
YTP
#
Hi Rui, 

Thanks for responding. I did not write "raw=raw", and I'm not sure why R
would return such a misleading error message. Indeed, the same error message
comes up when I run the 2nd part of your code:
Error in poly(dots[[1L]], degree, raw = raw) : 
  'degree' must be less than number of unique points

However I can think of a workaround for this, and I get the main idea from
your example, which does exactly what I was hoping for, thanks very much!

--
View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635732.html
Sent from the R help mailing list archive at Nabble.com.
#
On 2012-07-07 14:56, YTP wrote:
Are you sure that you submitted the poly() exactly as
you show above? I can get your error with raw=FALSE or
NULL or NA or 0 but not with raw=TRUE (or any nonzero
number).

This is with R version 2.15.1 Patched (2012-06-27 r59661).

Peter Ehlers
#
Hello,


Em 08-07-2012 03:00, Peter Ehlers escreveu:
It works with me too.
This has some days already, but I remember the error message and that I 
had tested it before posting the code above.


sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Portuguese_Portugal.1252 
LC_CTYPE=Portuguese_Portugal.1252
[3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C 

[5] LC_TIME=Portuguese_Portugal.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


Rui Barradas
2 days later
#
a) Please supply some context in your mail message.  Not everyone reads R-help via nabble.

b) poly(raw=TRUE, x, degree=degree) was changed for 2.15.0 to allow it to output
a non-full-rank matrix.  See the NEWS file in 2.15.0 or after:
  > # in R-2.15.1
  > n <- news()
  > n[grepl("poly", n$Text),]
  Changes in version 2.15.0:  
  
  NEW FEATURES
  
      o   poly(raw = TRUE) no longer requires more unique points than the
          degree.  (Requested by John Fox.)
  ...

c) Error messages include the innermost R function call before the error.  To see more call traceback(),
which will show the call stack from the innermost call back to the call you made:

  > # in R-2.14.1
  > m <- matrix(1:6, ncol=2)
  > poly(m, degree=6, raw=TRUE)
  Error in poly(dots[[1L]], degree, raw = raw) :
    'degree' must be less than number of unique points
  > traceback()
  6: stop("'degree' must be less than number of unique points")
  5: poly(dots[[1L]], degree, raw = raw)
  4: cbind(1, poly(dots[[1L]], degree, raw = raw))
  3: polym(V1 = 1:3, V2 = 4:6, degree = 6, raw = TRUE)
  2: do.call("polym", c(m, degree = degree, raw = raw))
  1: poly(m, degree = 6, raw = TRUE)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com