Tip: I() can designate constants in a regression
David James <djames at frontierassoc.com> writes:
Just thought I would share a tip I learned: The function I() is useful for specifying constants to formulas and regressions. It will prevent nls (for example) from trying to treat the variable inside I() as something it needs to estimate. An example is below. -David P.S. This may be obvious to some, but it is not made clear to be by the documentation or common books that I reviewed. These books, of course, do tend to mention others aspects of I(), which seems to be a very diverse function. For example: * ISwR by Dalgaard (p. 160, 177) * MASwS by Venables and Ripley (p.18) However, the books I looked at do not mention the specific tip here: Wrapping I() around a variable will make it a constant from the perspective of a regression. A humble suggestion to the many authors of the many great R and S books out there: I would find it helpful if more R books had the word "constants" in the index. Perhaps there could be a brief section that explained how to create constants in a regression. These sorts of problems, I would guess, occur more commonly with nls models than lm models.
First check whether your claim is actually correct:
x = 1:10
y = x # perfect fit
yeps = y + rnorm(length(y), sd = 0.01) # added noise
nls(yeps ~ a + b*x, start = list(a = 0.12345, b = 0.54321),
+ trace = TRUE)
74.2686 : 0.12345 0.54321
0.0006529895 : -0.002666984 1.000334031
Nonlinear regression model
model: yeps ~ a + b * x
data: parent.frame()
a b
-0.002666984 1.000334031
residual sum-of-squares: 0.0006529895
a <- 0
nls(yeps ~ a + b*x, start = list(b = 0.54321),trace=TRUE)
80.31713 : 0.54321
0.0006682311 : 0.999953
Nonlinear regression model
model: yeps ~ a + b * x
data: parent.frame()
b
0.999953
residual sum-of-squares: 0.0006682311
I.e., turning a into a constant works quite happily without the I().
Here is the example that motivated my tip:
weather.df : a data frame, where each row is one hour weather.df$temp : the temperature weather.df$annual : time offset, adjusted so that its period is one year weather.df$daily : time offset, adjusted so that its period is one day # I want a1,a2 to be constants from the point of view of nls a1 <- 66 a2 <- -18 nls.example <- nls( temp ~ I(a1) + I(a2)*sin( ts.annual ) + a3*sin ( ts.daily ), data=weather.df, start=c(a3=1) ) # leaving out the I() will cause nls to estimate values for a1 and a2
O__ ---- Peter Dalgaard ??ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907