lm on matrix data - R-help | R Mailing Lists

Wed, Oct 10, 2012 7:35 AM

Hi,

I have a question about using lm on a matrix. I have to admit it is
very trivial, but I couldn't find the answer after searching the
mailing list and other online tutorials. It would be great if you
could help.

I have a matrix "trainx" of 492 (rows) by 220 (columns) that is my x,
and "trainy" is 492 by 1. I also have the new data "testx", which is
240 (rows) by 220 (columns). Here is what I got:

py <- predict(lm(trainy ~ trainx ), data.frame(testx))
Warning message:
'newdata' had 240 rows but variable(s) found have 492 rows

The fitting formula I intended is: trainy ~ trainx[,1] + trainx[,2] +
... + trainx[,220].

Any help, please?

Best,
Baoqiang

R. Michael Weylandt

Wed, Oct 10, 2012 3:33 PM

On Wed, Oct 10, 2012 at 3:35 PM, Baoqiang Cao <bqcaomail at gmail.com> wrote:

I think you want a formula like

trainy ~ .

meaning "trainy" explained by everything else in the data frame you
pass as the data argument. (Admittedly, I think any model with 220
regressors is going to be absolutely terrible, but that's a different
email.)
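
As a minimal sketch of the "trainy ~ ." approach: the data below are
small simulated stand-ins (50 by 3 and 20 by 3) for the poster's
matrices, and the column names x1, x2, x3 and the object names
train_df and fit are assumptions made for illustration. The point is
that the response and regressors go into one data frame, and newdata
carries the same column names.

```r
set.seed(1)

# Hypothetical small stand-ins for the poster's trainx, trainy and testx
trainx <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)
trainy <- drop(trainx %*% c(1, -2, 0.5)) + rnorm(50, sd = 0.1)
testx  <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)

# Give both matrices the same column names so predict() can match them
colnames(trainx) <- colnames(testx) <- paste0("x", seq_len(ncol(trainx)))

# One data frame holding the response and every regressor;
# "." in the formula then means "all other columns of train_df"
train_df <- data.frame(trainy = trainy, trainx)
fit <- lm(trainy ~ ., data = train_df)

# newdata is a data frame with the same regressor names as train_df
py <- predict(fit, newdata = data.frame(testx))
length(py)  # one prediction per row of testx
```

With the real data, the same pattern should apply unchanged, as long
as trainx and testx share column names.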

What I think is happening here is that predict() looks for "trainx"
as a column name in the newdata you provide, can't find it, and then
falls back to the "trainx" matrix in your workspace, which has 492
rows rather than the 240 you need. Take a look at ?formula for more
on how to use formula notation properly.
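
The lookup described above can be reproduced on a small scale. This
is only a sketch on simulated data (50 training rows, 20 test rows),
and the names fit_m, bad and py are made up for illustration. It also
shows one way to keep the original matrix formula, assuming newdata
is given as a list whose element is named "trainx".

```r
set.seed(1)

# Hypothetical small stand-ins for the poster's trainx, trainy and testx
trainx <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)
trainy <- drop(trainx %*% c(1, -2, 0.5)) + rnorm(50, sd = 0.1)
testx  <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)

# A matrix on the right-hand side yields one coefficient per column
fit_m <- lm(trainy ~ trainx)

# data.frame(testx) has columns X1, X2, X3 and nothing called "trainx",
# so predict() picks up the training matrix from the workspace instead.
# The warning is suppressed here only to keep the sketch quiet.
bad <- suppressWarnings(predict(fit_m, newdata = data.frame(testx)))
length(bad) == nrow(trainx)  # predictions for the training rows

# Supplying the test matrix under the name the formula uses
py <- predict(fit_m, newdata = list(trainx = testx))
length(py) == nrow(testx)    # predictions for the test rows
```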

Cheers,
Michael

Jean V Adams

Thu, Oct 11, 2012 6:30 AM

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20121011/fd5a3e7b/attachment.pl>