Dear All,
For a data mining project, I am relying heavily on the randomForest and
party packages.
Due to the large size of the data set, I often run into memory problems (in
particular with the party package; randomForest seems to use less memory).
I really have two questions at this point:
1) Please see how I am using the party and randomForest packages. Any
comments are welcome.
myparty <- cforest(SalePrice ~ ModelID + ProductGroup + ProductGroupDesc +
                     MfgYear + saledate3 + saleday + salemonth,
                   data = trainRF,
                   control = cforest_unbiased(mtry = 3, ntree = 300, trace = TRUE))
rf_model <- randomForest(SalePrice ~ ModelID + ProductGroup + ProductGroupDesc +
                           MfgYear + saledate3 + saleday + salemonth,
                         data = trainRF, na.action = na.omit,
                         importance = TRUE, do.trace = 100, mtry = 3, ntree = 300)
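If memory is the bottleneck, one option (a sketch on synthetic stand-in data, not the poster's actual setup) is to let each tree see only a subsample and to grow shallower trees, via randomForest's documented sampsize and nodesize arguments; the data frame and all values below are illustrative.

```r
## Sketch: a memory-lighter randomForest fit. fakeRF is a small synthetic
## stand-in for the poster's trainRF, so the example is self-contained.
set.seed(1)
fakeRF <- data.frame(SalePrice = rnorm(500),
                     ModelID   = factor(sample(letters[1:5], 500, TRUE)),
                     MfgYear   = sample(1990:2010, 500, TRUE))

if (requireNamespace("randomForest", quietly = TRUE)) {
  rf_small <- randomForest::randomForest(
    SalePrice ~ ModelID + MfgYear,
    data        = fakeRF,
    sampsize    = 200,   # each tree bootstraps from 200 rows, not all rows
    nodesize    = 20,    # larger terminal nodes => shallower, smaller trees
    ntree       = 100,
    keep.forest = FALSE) # drop the stored forest if only OOB error is needed
}
```

Setting keep.forest = FALSE discards the fitted trees, so this variant cannot be used for later prediction; it is only a way to estimate error within a tight memory budget.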
2) I have another question: sometimes R stops after telling me that it
is unable to allocate a vector of, e.g., 1.5 GB.
However, I have 4 GB of RAM on my box, so technically the memory is
there. Is there a way to enable R to use more of it?
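For reference, a minimal base-R sketch of how one can inspect memory use from inside a session; memory.limit() exists only in Windows builds of R (the norm when this thread was written), hence the guard.

```r
## Sketch: inspecting R's memory use from within a session (base R only).
x <- numeric(1e6)                   # ~8 MB of doubles
print(object.size(x), units = "MB") # size of one object
gc()                                # report (and trigger) garbage collection
if (exists("memory.limit"))         # Windows-only query of the allocation cap
  memory.limit()
```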
Many thanks
Lorenzo
RandomForest, Party and Memory Management
3 messages · Lorenzo Isella, Jeff Newmiller, Brian Ripley
Neither of your questions meets the Posting Guidelines (see footer of any
email).
1) Not reproducible. [1]
2) Very operating-system specific and a FAQ. You have not indicated what
your OS is (via sessionInfo), nor what reading you have done to address
memory problems already (use a search engine, or begin with the FAQs in R
help or on CRAN).
[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
---------------------------------------------------------------------------
Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
Sent from my phone. Please excuse my brevity.
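A minimal reproducible example along the lines of [1] could look like the sketch below; every column name and value is an illustrative stand-in for the poster's trainRF, built with a seed so others get identical data.

```r
## Sketch of a self-contained, reproducible post: seeded fake data plus
## sessionInfo() so readers know the platform.
set.seed(42)
trainRF <- data.frame(
  SalePrice    = rlnorm(200, meanlog = 10),
  ProductGroup = factor(sample(c("BL", "TTT", "WL"), 200, replace = TRUE)),
  MfgYear      = sample(1995:2010, 200, replace = TRUE))
dput(head(trainRF, 3)) # paste this output so others can rebuild the data
sessionInfo()          # OS, R version, loaded packages
```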
Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
[original message quoted in full; snipped]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On Sun, 3 Feb 2013, Lorenzo Isella wrote:
[quoted introduction and model-fitting code snipped]
> 2) I have another question: sometimes R crashes after telling me that it
> is unable to allocate e.g. an array of 1.5 Gb.
Do not use the word 'crash': see the posting guide. I suspect it gives you an error message.
> However, I have 4Gb of ram on my box, so...technically the memory is
> there, but is there a way to enable R to use more of it?
Yes. I am surmising this is Windows but you have not told us so. See the rw-FAQ. The real answer is to run a 64-bit OS: your computer may have 4GB of RAM, but your OS has a 2GB address space which could be raised to 3GB.
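To check which situation applies, a small base-R sketch that reports whether the running R is a 32- or 64-bit build (an 8-byte pointer means 64-bit, which removes the 2-3 GB per-process ceiling described above):

```r
## Sketch: confirming whether the running R is 32- or 64-bit (base R).
.Machine$sizeof.pointer  # 4 => 32-bit build, 8 => 64-bit build
R.version$arch           # architecture string, e.g. "x86_64"
```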
> Many thanks
> Lorenzo
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595