Hi all,
I have one array of predictors, one observation per row, and one array
of responses, also arranged one observation per row. I arrange these
into a data.frame and call lm() with a pasted-together formula.
I would like to call lm() with a number of responses in excess of 100,
but for some reason, 39 seems to be a limit. Why do I get an "invalid
variable names" error from model.frame() when supplying 40 or more
responses? As a workaround, I can loop through groups of 39 responses
in separate calls to lm(), but that seems inefficient and possibly
version- or platform-dependent.
Here is my best effort at a minimal example showing the problem.
--- begin pasted R session ---
> test.this <- function(n.resp, n.obs, n.pred) {
+ my.resp <- matrix(runif(n.resp * n.obs), nrow=n.obs)
+ my.resp.names <- paste("Response", 1:n.resp, sep=".")
+ my.pred <- matrix(runif(n.pred * n.obs), nrow=n.obs)
+ my.pred.names <- paste("Predictor", 1:n.pred, sep=".")
+ my.formula <- as.formula(paste("cbind(",
+ paste(my.resp.names, collapse=", "), ") ~ ",
+ paste(my.pred.names, collapse=" + ")))
+ d.tmp <- cbind(my.pred, my.resp)
+ d.tmp <- as.data.frame(d.tmp)
+ names(d.tmp) <- c(my.pred.names, my.resp.names)
+ my.lm <- lm(my.formula, data=d.tmp, model=F, qr=F, x=F, y=F,
+ na.action=na.exclude)
+ my.lm
+ }
> # Now, try it. 39 response vectors is OK, but 40 causes an error:
> m1 <- test.this(40, 10, 2)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames, :
invalid variable names
> m1 <- test.this(39, 10, 2)
> # No error for n.resp == 39.
> # Also, shouldn't "qr=F" in the call to lm() turn off output of m1$qr?
> # m1$qr exists. I'd like to save memory and omit it if possible.
> str(m1$qr)
List of 5
$ qr : num [1:10, 1:3] -3.162 0.316 0.316 0.316 0.316 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:10] "1" "2" "3" "4" ...
.. ..$ : chr [1:3] "(Intercept)" "Predictor.1" "Predictor.2"
..- attr(*, "assign")= int [1:3] 0 1 2
$ qraux: num [1:3] 1.32 1.34 1.42
$ pivot: int [1:3] 1 2 3
$ tol : num 1e-07
$ rank : int 3
- attr(*, "class")= chr "qr"
> # Here's my version:
> version
_
platform i386-pc-mingw32
arch i386
os mingw32
system i386, mingw32
status
major 2
minor 0.1
year 2004
month 11
day 15
language R
--- end pasted R session ---
Best regards,
John
lm() with many responses
2 messages · John Pitney, Brian Ripley
On Tue, 12 Apr 2005, John Pitney wrote:
> I have one array of predictors, one observation per row, and one array
> of responses, also arranged one observation per row. I arrange these
> into a data.frame and call lm() with a pasted-together formula.
> I would like to call lm() with a number of responses in excess of 100,
> but for some reason, 39 seems to be a limit. Why do I get an "invalid
> variable names" error from model.frame() when supplying 40 or more
> responses?
Your expression is too long. Create the response matrix and pass that to the formula, rather than passing an expression. There is a 500-char internal limit on variable names in model.frame.default. That should be enough ....
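A minimal sketch of that suggestion, adapted from the test.this() function in the original post. The names test.matrix and m2 are hypothetical, and it assumes the response matrix is visible in the environment where the formula is created, so the formula can refer to it by a short name instead of a long cbind(...) expression:

```r
# Hypothetical variant of test.this(): the responses stay in one matrix,
# and the formula names that matrix instead of listing every column.
test.matrix <- function(n.resp, n.obs, n.pred) {
  my.resp <- matrix(runif(n.resp * n.obs), nrow = n.obs)
  colnames(my.resp) <- paste("Response", 1:n.resp, sep = ".")
  my.pred <- matrix(runif(n.pred * n.obs), nrow = n.obs)
  my.pred.names <- paste("Predictor", 1:n.pred, sep = ".")
  d.tmp <- as.data.frame(my.pred)
  names(d.tmp) <- my.pred.names
  # The left-hand side is just "my.resp"; lm() finds the matrix in the
  # formula's environment, which is this function's environment.
  my.formula <- as.formula(paste("my.resp ~",
                                 paste(my.pred.names, collapse = " + ")))
  lm(my.formula, data = d.tmp)
}

# 150 responses in a single call
m2 <- test.matrix(150, 10, 2)
dim(coef(m2))
```

The coefficients come back as a matrix with one column per response, labelled by the column names of the response matrix.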
> As a workaround, I can loop through groups of 39 responses
> in separate calls to lm(), but that seems inefficient and possibly
> version- or platform-dependent.
> Here is my best effort at a minimal example showing the problem.
It's not easy to cut-and-paste, though.
> --- begin pasted R session ---
> > test.this <- function(n.resp, n.obs, n.pred) {
> + my.resp <- matrix(runif(n.resp * n.obs), nrow=n.obs)
> + my.resp.names <- paste("Response", 1:n.resp, sep=".")
> + my.pred <- matrix(runif(n.pred * n.obs), nrow=n.obs)
> + my.pred.names <- paste("Predictor", 1:n.pred, sep=".")
> + my.formula <- as.formula(paste("cbind(",
> + paste(my.resp.names, collapse=", "), ") ~ ",
> + paste(my.pred.names, collapse=" + ")))
> + d.tmp <- cbind(my.pred, my.resp)
> + d.tmp <- as.data.frame(d.tmp)
> + names(d.tmp) <- c(my.pred.names, my.resp.names)
> + my.lm <- lm(my.formula, data=d.tmp, model=F, qr=F, x=F, y=F,
> + na.action=na.exclude)
> + my.lm
> + }
> > # Now, try it. 39 response vectors is OK, but 40 causes an error:
> > m1 <- test.this(40, 10, 2)
> Error in model.frame(formula, rownames, variables, varnames, extras,
> extranames, :
> invalid variable names
> > m1 <- test.this(39, 10, 2)
> > # No error for n.resp == 39.
> > # Also, shouldn't "qr=F" in the call to lm() turn off output of m1$qr?
Only if it were implemented.
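One possible way to shed the component after fitting, sketched under the assumption that the fitted object is an ordinary list and that only the coefficients, fitted values, or residuals are needed afterwards. Methods that rely on the QR decomposition, such as summary(), will no longer work on the stripped object. The data and the names d and fit are made up for illustration:

```r
# Made-up data, for illustration only
d <- data.frame(x = runif(10), y = runif(10))
fit <- lm(y ~ x, data = d)

size.before <- object.size(fit)
fit$qr <- NULL                 # drop the QR decomposition from the fit
size.after <- object.size(fit)

is.null(fit$qr)                # the component is gone
coef(fit)                      # coefficients are still available
```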
> > # m1$qr exists. I'd like to save memory and omit it if possible.
> > str(m1$qr)
> List of 5
> $ qr : num [1:10, 1:3] -3.162 0.316 0.316 0.316 0.316 ...
> ..- attr(*, "dimnames")=List of 2
> .. ..$ : chr [1:10] "1" "2" "3" "4" ...
> .. ..$ : chr [1:3] "(Intercept)" "Predictor.1" "Predictor.2"
> ..- attr(*, "assign")= int [1:3] 0 1 2
> $ qraux: num [1:3] 1.32 1.34 1.42
> $ pivot: int [1:3] 1 2 3
> $ tol : num 1e-07
> $ rank : int 3
> - attr(*, "class")= chr "qr"
> > # Here's my version:
> > version
> _
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status
> major 2
> minor 0.1
> year 2004
> month 11
> day 15
> language R
> --- end pasted R session ---
> Best regards,
> John
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595