I would like to specify a model with all polynomial interaction terms between two variables, say, up to degree 6. For example, terms like a^6 + (a^5 * b^1) + (a^4 * b^2) + ... and so on. The documentation states The ^ operator indicates crossing to the specified degree. so I would expect a model specified as y ~ (a+b)^6 to produce these terms. However doing this only returns four slope coefficients, for Intercept, a, b, and a:b. Does anyone know how to produce the desired result? Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130.html Sent from the R help mailing list archive at Nabble.com.
Specify model with polynomial interaction terms up to degree n
15 messages · Bert Gunter, Rui Barradas, David Winsemius +3 more
On Jul 2, 2012, at 9:29 AM, YTP wrote:
I would like to specify a model with all polynomial interaction terms between two variables, say, up to degree 6. For example, terms like a^6 + (a^5 * b^1) + (a^4 * b^2) + ... and so on. The documentation states The ^ operator indicates crossing to the specified degree. so I would expect a model specified as y ~ (a+b)^6 to produce these terms. However doing this only returns four slope coefficients, for Intercept, a, b, and a:b. Does anyone know how to produce the desired result? Thanks in advance.
You might try: poly(a,6)*poly(b,6) (untested ... and it looks somewhat dangerous to me.)
David Winsemius, MD West Hartford, CT
On Jul 2, 2012, at 10:51 AM, David Winsemius wrote:
On Jul 2, 2012, at 9:29 AM, YTP wrote:
I would like to specify a model with all polynomial interaction terms between two variables, say, up to degree 6. For example, terms like a^6 + (a^5 * b^1) + (a^4 * b^2) + ... and so on. The documentation states The ^ operator indicates crossing to the specified degree. so I would expect a model specified as y ~ (a+b)^6 to produce these terms. However doing this only returns four slope coefficients, for Intercept, a, b, and a:b. Does anyone know how to produce the desired result? Thanks in advance.
You might try: poly(a,6)*poly(b,6) (untested ... and it looks somewhat dangerous to me.)
Well, now it's tested and succeeds at least numerically. Also tested
( poly(a,6) +poly(b,6) )^2 with identical results.
Whether this is wise practice remains in doubt:
dfrm <- data.frame(out=rnorm(100), a=rnorm(100), b=rnorm(100) )
anova(lm( out ~ (poly(a,6) +poly(b,6) )^2, data=dfrm) )
#-----------------------
Analysis of Variance Table
Response: out
Df Sum Sq Mean Sq F value Pr(>F)
poly(a, 6) 6 12.409 2.06810 3.0754 0.01202 *
poly(b, 6) 6 5.321 0.88675 1.3187 0.26596
poly(a, 6):poly(b, 6) 36 41.091 1.14142 1.6974 0.04069 *
Residuals 51 34.295 0.67246
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
David Winsemius, MD West Hartford, CT
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120702/d72a5e17/attachment.pl>
Hello, Another way is to cbind the vectors 'a' and 'b', but this needs argument 'raw' set to TRUE. poly(cbind(a, b), 6, raw=TRUE) To the OP: is this time series related? With 6 being a lag or test (e.g., Tsay, 1986) order? I'm asking this because package nlts has a function for this test up to order 5 and it uses poly(). Hope this helps, Rui Barradas Em 02-07-2012 16:04, David Winsemius escreveu:
On Jul 2, 2012, at 10:51 AM, David Winsemius wrote:
On Jul 2, 2012, at 9:29 AM, YTP wrote:
I would like to specify a model with all polynomial interaction terms between two variables, say, up to degree 6. For example, terms like a^6 + (a^5 * b^1) + (a^4 * b^2) + ... and so on. The documentation states The ^ operator indicates crossing to the specified degree. so I would expect a model specified as y ~ (a+b)^6 to produce these terms. However doing this only returns four slope coefficients, for Intercept, a, b, and a:b. Does anyone know how to produce the desired result? Thanks in advance.
You might try: poly(a,6)*poly(b,6) (untested ... and it looks somewhat dangerous to me.)
Well, now it's tested and succeeds at least numerically. Also tested
( poly(a,6) +poly(b,6) )^2 with identical results.
Whether this is wise practice remains in doubt:
dfrm <- data.frame(out=rnorm(100), a=rnorm(100), b=rnorm(100) )
anova(lm( out ~ (poly(a,6) +poly(b,6) )^2, data=dfrm) )
#-----------------------
Analysis of Variance Table
Response: out
Df Sum Sq Mean Sq F value Pr(>F)
poly(a, 6) 6 12.409 2.06810 3.0754 0.01202 *
poly(b, 6) 6 5.321 0.88675 1.3187 0.26596
poly(a, 6):poly(b, 6) 36 41.091 1.14142 1.6974 0.04069 *
Residuals 51 34.295 0.67246
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
I am not sure that method works. It appears to be doing something close, but returns too many slope coefficients, since I think it is returning interaction terms of degree smaller and greater than what was passed to it. Here is a small example of degree 2:
X = data.frame(cbind(c(1,2,3), c(4,5,6))) X
X1 X2 1 1 4 2 2 5 3 3 6
y = c(0,1,0)
mymodel2 = glm(y ~ poly(X$X1,2)*poly(X$X2,2), family=binomial(link="logit"))
# We see slope coefficients returned for interaction terms of degree 1, 3, and 4.
summary(mymodel2)
Estimate Std. Error z value Pr(>|z|) (Intercept) -7.855e+00 4.588e+04 0 1 poly(X$X1, 2)1 6.059e-15 7.946e+04 0 1 poly(X$X1, 2)2 -3.848e+01 7.946e+04 0 1 poly(X$X2, 2)1 NA NA NA NA poly(X$X2, 2)2 NA NA NA NA poly(X$X1, 2)1:poly(X$X2, 2)1 NA NA NA NA poly(X$X1, 2)2:poly(X$X2, 2)1 NA NA NA NA poly(X$X1, 2)1:poly(X$X2, 2)2 NA NA NA NA poly(X$X1, 2)2:poly(X$X2, 2)2 NA NA NA NA # Data used in the model
mymodel2$model
y poly(X$X1, 2).1 poly(X$X1, 2).2 poly(X$X2, 2).1 poly(X$X2, 2).2 1 0 -7.071068e-01 4.082483e-01 -7.071068e-01 4.082483e-01 2 1 4.350720e-18 -8.164966e-01 4.350720e-18 -8.164966e-01 3 0 7.071068e-01 4.082483e-01 7.071068e-01 4.082483e-01 where instead I would expect the data to be like 1 4 16 4 10 25 9 18 36 Any other ideas? Many thanks. -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635196.html Sent from the R help mailing list archive at Nabble.com.
Hi Bert, thank you for pointing out that this method does not work. Do you happen to have any ideas as to how it could be done? Many thanks. -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635197.html Sent from the R help mailing list archive at Nabble.com.
On Jul 2, 2012, at 4:13 PM, YTP wrote:
I am not sure that method works. It appears to be doing something close, but returns too many slope coefficients, since I think it is returning interaction terms of degree smaller and greater than what was passed to it.
No. It _was_ doing what you asked for. I think you don't understand the expansion.
Here is a small example of degree 2:
X = data.frame(cbind(c(1,2,3), c(4,5,6))) X
X1 X2 1 1 4 2 2 5 3 3 6
y = c(0,1,0)
mymodel2 = glm(y ~ poly(X$X1,2)*poly(X$X2,2), family=binomial(link="logit"))
# We see slope coefficients returned for interaction terms of degree 1, 3, and 4.
Degree 3 and 4 ??? That wasn't how I would have described the results
(there being only first and second order terms as determined by the
'degree' argument), but the fact remains.... You cannot estimate more
than three parameters with a dataset of only 3 points. That is basic
math.
It seems you have demonstrated that these weapons are "too sharp" to
be wielded safely in your hands, so maybe you should step back a few
paces and re-consider what you are really trying to accomplish. Is the
goal really curve fitting with with N^2 polynomial parameters. And do
you _really_ want to be describing to your audience the interpretation
of models created with a logit link that have high degree polynomial
terms? Or are you instead interested in flexible regression for
purposes of description or exploratory analyses such as provided by
loess (one form of local regression) or perhaps GAM's with smoothing
terms?
test = data.frame(y = rnorm(1000),X1=rnorm(1000), X2=rnorm(1000) )
plot.new()
persp( x= seq(-2, 2,by=0.1), y= seq(-2, 2,by=0.1),
z= predict(mymodel2,
newdata =expand.grid(X1=seq(-2, 2,by=0.1), X2=seq(-2,
2,by=0.1)) ) ,
ticktype="detailed")
This has the advantage that the polynomial basis is hidden and you
only end up looking at the predictions. I also like working with Frank
Harrell's 'rms' package because his perimeter function restricts
plotted estimates to regions with sufficient data and the restricted
cubic splines have less tendency to blow-up near the boundaries of hte
data.
summary(mymodel2)
Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.855e+00 4.588e+04 0 1
poly(X$X1, 2)1 6.059e-15 7.946e+04 0 1
poly(X$X1, 2)2 -3.848e+01 7.946e+04 0 1
poly(X$X2, 2)1 NA NA NA NA
poly(X$X2, 2)2 NA NA NA NA
poly(X$X1, 2)1:poly(X$X2, 2)1 NA NA NA NA
poly(X$X1, 2)2:poly(X$X2, 2)1 NA NA NA NA
poly(X$X1, 2)1:poly(X$X2, 2)2 NA NA NA NA
poly(X$X1, 2)2:poly(X$X2, 2)2 NA NA NA NA
# Data used in the model
mymodel2$model
y poly(X$X1, 2).1 poly(X$X1, 2).2 poly(X$X2, 2).1 poly(X$X2, 2).2 1 0 -7.071068e-01 4.082483e-01 -7.071068e-01 4.082483e-01 2 1 4.350720e-18 -8.164966e-01 4.350720e-18 -8.164966e-01 3 0 7.071068e-01 4.082483e-01 7.071068e-01 4.082483e-01
Orthogonal basis.
where instead I would expect the data to be like 1 4 16 4 10 25 9 18 36
?poly
You can get something like that with poly(X1, degree=2, raw=TRUE)
> poly(1:3, degree=2, raw=TRUE)
1 2
[1,] 1 1
[2,] 2 4
[3,] 3 9
attr(,"degree")
[1] 1 2
attr(,"class")
[1] "poly" "matrix"
David Winsemius, MD West Hartford, CT
I think you have taken my toy example seriously. Perhaps I wasn't clear, but I am in fact not working with a dataset of 3 observations of the numbers 1 through 6 and trying to estimate anything; that was an example to illustrate what I am asking for, namely, turning two variables like this X1 X2 1 4 2 5 3 6 into a dataset like this 1 4 16 4 10 25 9 18 36 where each column is the interaction between certain polynomial terms of the original variables, such that each column has a sum of exponents equal to 2 (or whatever degree n is desired). My apologies if I wasn't clear, any other ideas would be appreciated. It appears the poly command you mentioned is only taking powers of a single vector. -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635217.html Sent from the R help mailing list archive at Nabble.com.
Hello, Sorry, but it was you that misread some of the suggestions. I have written raw=TRUE not raw=raw. Just see m <- matrix(1:6, ncol=2) # your example p2 <- poly(m, degree=2, raw=TRUE) # it's raw=TRUE, not raw=raw !!! deg2 <- attr(p2, 'degree') == 2 p2[, deg2] p6 <- poly(m, degree=6, raw=TRUE) # now degree 6 deg6 <- attr(p6, 'degree') == 6 p6[, deg6] Hope this helps, Rui Barradas Em 02-07-2012 23:52, YTP escreveu:
I think you have taken my toy example seriously. Perhaps I wasn't clear, but I am in fact not working with a dataset of 3 observations of the numbers 1 through 6 and trying to estimate anything; that was an example to illustrate what I am asking for, namely, turning two variables like this X1 X2 1 4 2 5 3 6 into a dataset like this 1 4 16 4 10 25 9 18 36 where each column is the interaction between certain polynomial terms of the original variables, such that each column has a sum of exponents equal to 2 (or whatever degree n is desired). My apologies if I wasn't clear, any other ideas would be appreciated. It appears the poly command you mentioned is only taking powers of a single vector. -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635217.html Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
4 days later
Hi Rui, Thanks for responding. I did not write "raw=raw", and I'm not sure why R would return such a misleading error message. Indeed, the same error message comes up when I run the 2nd part of your code:
m <- matrix(1:6, ncol=2) p6 <- poly(m, degree=6, raw=TRUE)
Error in poly(dots[[1L]], degree, raw = raw) : 'degree' must be less than number of unique points However I can think of a workaround for this, and I get the main idea from your example, which does exactly what I was hoping for, thanks very much! -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635732.html Sent from the R help mailing list archive at Nabble.com.
On 2012-07-07 14:56, YTP wrote:
Hi Rui, Thanks for responding. I did not write "raw=raw", and I'm not sure why R would return such a misleading error message. Indeed, the same error message comes up when I run the 2nd part of your code:
m <- matrix(1:6, ncol=2) p6 <- poly(m, degree=6, raw=TRUE)
Error in poly(dots[[1L]], degree, raw = raw) : 'degree' must be less than number of unique points
Are you sure that you submitted the poly() exactly as you show above? I can get your error with raw=FALSE or NULL or NA or 0 but not with raw=TRUE (or any nonzero number). This is with R version 2.15.1 Patched (2012-06-27 r59661). Peter Ehlers
However I can think of a workaround for this, and I get the main idea from your example, which does exactly what I was hoping for, thanks very much! -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635732.html Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hello, Em 08-07-2012 03:00, Peter Ehlers escreveu:
On 2012-07-07 14:56, YTP wrote:
Hi Rui, Thanks for responding. I did not write "raw=raw", and I'm not sure why R would return such a misleading error message. Indeed, the same error message comes up when I run the 2nd part of your code:
m <- matrix(1:6, ncol=2) p6 <- poly(m, degree=6, raw=TRUE)
Error in poly(dots[[1L]], degree, raw = raw) : 'degree' must be less than number of unique points
Are you sure that you submitted the poly() exactly as you show above? I can get your error with raw=FALSE or NULL or NA or 0 but not with raw=TRUE (or any nonzero number). This is with R version 2.15.1 Patched (2012-06-27 r59661). Peter Ehlers
It works with me too. This has some days already, but I remember the error message and that I had tested it before posting the code above. sessionInfo() R version 2.15.0 (2012-03-30) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=Portuguese_Portugal.1252 LC_CTYPE=Portuguese_Portugal.1252 [3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C [5] LC_TIME=Portuguese_Portugal.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base Rui Barradas
However I can think of a workaround for this, and I get the main idea from your example, which does exactly what I was hoping for, thanks very much! -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4635732.html Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
2 days later
Yep, that code is verbatim what I typed in, using version 2.14 ... seems weird. -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with-polynomial-interaction-terms-up-to-degree-n-tp4635130p4636031.html Sent from the R help mailing list archive at Nabble.com.
a) Please supply some context in your mail message. Not everyone reads R-help via nabble.
b) poly(raw=TRUE, x, degree=degree) was changed for 2.15.0 to allow it to output
a non-full-rank matrix. See the NEWS file in 2.15.0 or after:
> # in R-2.15.1
> n <- news()
> n[grepl("poly", n$Text),]
Changes in version 2.15.0:
NEW FEATURES
o poly(raw = TRUE) no longer requires more unique points than the
degree. (Requested by John Fox.)
...
c) Error messages include the innermost R function call before the error. To see more call traceback(),
which will show the call stack from the innermost call back to the call you made:
> # in R-2.14.1
> m <- matrix(1:6, ncol=2)
> poly(m, degree=6, raw=TRUE)
Error in poly(dots[[1L]], degree, raw = raw) :
'degree' must be less than number of unique points
> traceback()
6: stop("'degree' must be less than number of unique points")
5: poly(dots[[1L]], degree, raw = raw)
4: cbind(1, poly(dots[[1L]], degree, raw = raw))
3: polym(V1 = 1:3, V2 = 4:6, degree = 6, raw = TRUE)
2: do.call("polym", c(m, degree = degree, raw = raw))
1: poly(m, degree = 6, raw = TRUE)
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of YTP Sent: Tuesday, July 10, 2012 11:23 AM To: r-help at r-project.org Subject: Re: [R] Specify model with polynomial interaction terms up to degree n Yep, that code is verbatim what I typed in, using version 2.14 ... seems weird. -- View this message in context: http://r.789695.n4.nabble.com/Specify-model-with- polynomial-interaction-terms-up-to-degree-n-tp4635130p4636031.html Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.