Apologies for catching this so late. I have been out for a few weeks and am still trying to recover from that...

From: Jim Porzak
Hi Wanghong,
Unless you have a huge Linux box, you will need to sample down your 300k rows to a few thousand.
In marketing apps, I often have data sets of comparable size.
I would suggest you start with just a few thousand rows to make sure everything else is working as you wish. Also, study Andy's randomForest docs carefully, including the R News article from a couple of years ago.
In particular,
1) The formula interface is a memory hog. Andy suggests just using explicit declaration. In your case, something like
randomForest(Memebers[42], Memebers[-42], ...
Actually, that first argument probably should be Members[[42]]. I believe you get a data frame with one variable if you do Members[42].
Best,
Andy
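To illustrate the difference between the two indexing forms, here is a minimal sketch on a made-up data frame. The name `toy` and its columns are hypothetical and are not the poster's data. Single brackets keep the data frame wrapper, while double brackets extract the column itself, which is what a response argument would normally need to be.

```r
# Hypothetical toy data standing in for the real 42-column data set.
toy <- data.frame(
  x1    = c(1.2, 3.4, 5.6, 7.8),
  x2    = c(0.1, 0.2, 0.3, 0.4),
  class = factor(c("a", "b", "a", "b"))
)

resp_df  <- toy[3]    # single bracket: still a data frame with one column
resp_vec <- toy[[3]]  # double bracket: the factor itself
preds    <- toy[-3]   # every column except the response

print(class(resp_df))
print(class(resp_vec))
print(dim(preds))
```

As far as I recall from the package documentation, the non-formula method of randomForest takes the predictors as `x` and the response as `y`, so naming the arguments, for example `randomForest(x = Memebers[-42], y = Memebers[[42]], ...)`, avoids depending on their order.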
2) The proximity matrix is also memory- and time-intensive. I suggest proximity = FALSE until other things are sorted out.

HTH,
Jim Porzak
TGN.com
San Francisco, CA
http://www.linkedin.com/in/jimporzak
useR Group SF: http://ia.meetup.com/67/

2008/12/26 wanghong <wanghong at neusoft.edu.cn>
hello,
I want to use randomForest to classify a matrix which is 331030 x 42; the last column is the class label. I use:
Memebers.rf<-randomForest(class~.,data=Memebers,proximity=TRUE,mtry=6,ntree=200)
which gave me the error "matrix(0,n,n) set too elements". Then I use:
Memebers.rf<-randomForest(class~.,data=Memebers,importance=TRUE,proximity=TRUE)
which gave me the error "na.fail.default(list(class = c(17L, 17L, 17L, 29L, 29L, 29L, : missing values in object".
What's wrong with it? Thanks a lot.
wanghong
wanghong at neusoft.edu.cn
2008-12-26
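For a sense of scale, the arithmetic behind the advice to sample down and to switch off the proximity matrix can be sketched as follows. This is only a rough estimate: it assumes the proximity matrix is stored as an n x n matrix of 8-byte doubles, and the sample size of 5000 is a hypothetical stand-in for "a few thousand" rows. Under those assumptions the full data set would need roughly 877 GB for the proximity matrix alone, against about 200 MB for the sample.

```r
# Storage for an n x n matrix of doubles, at 8 bytes per element.
proximity_bytes <- function(n) as.numeric(n)^2 * 8

n_full   <- 331030  # number of rows reported in the original post
n_sample <- 5000    # hypothetical sample size ("a few thousand" rows)

cat("full data:", round(proximity_bytes(n_full) / 1e9), "GB\n")
cat("sample:   ", round(proximity_bytes(n_sample) / 1e6), "MB\n")

# Drawing the sample: pick row indices without replacement.
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
idx <- sample(n_full, n_sample)
# Memebers.small <- Memebers[idx, ]  # assumes the poster's data frame exists
```

The commented-out last line shows how the indices would be applied to the poster's data frame; it is left inactive because that object is not available here.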