randomForest
Uwe Ligges wrote:
Anirudh Kondaveeti wrote:
To be more clear, my data set contains two classes, Class 1 and Class 2. Class 1 is the original data, with 300 rows. Class 2 is randomly generated data, with 1500 rows. I want to sample a new data set containing all rows of Class 1 and only 300 of the 1500 rows of Class 2, and then use it in a random forest with 500 trees. Class 2 should also contribute a different set of 300 rows to each tree in the forest. Thanks!
Ah, in that case (stratified sampling), combine the arguments "strata" and "sampsize". In principle, though, you cannot select ALL rows of one class: that ignores one of the main ideas of randomForest, namely bootstrapping the observations, and randomForest will certainly bootstrap for you.
In fact, you can also use replace = FALSE, but then, as I said, one of the main ideas of randomForest is ignored.
Uwe Ligges
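For readers following the thread, here is a minimal sketch of the stratified setup discussed above, combining strata, sampsize and replace = FALSE. The simulated data, the predictor count and the class labels are assumptions made up for illustration; only the argument names come from ?randomForest. The keep.inbag = TRUE argument is used only to inspect which rows each tree received.

```r
library(randomForest)

set.seed(1)

# Hypothetical stand-in data: 300 "original" rows (class1) and
# 1500 randomly generated rows (class2), with 5 numeric predictors.
x <- rbind(matrix(rnorm(300 * 5, mean = 1), ncol = 5),
           matrix(rnorm(1500 * 5), ncol = 5))
y <- factor(rep(c("class1", "class2"), times = c(300, 1500)))

rf <- randomForest(x, y,
                   ntree      = 500,
                   strata     = y,            # stratify on the class label
                   sampsize   = c(300, 300),  # one entry per stratum
                   replace    = FALSE,        # draw without replacement
                   keep.inbag = TRUE)         # record the rows used by each tree

# rf$inbag is an n x ntree matrix; nonzero entries mark in-bag rows.
inbag <- rf$inbag > 0
class1_per_tree <- colSums(inbag[y == "class1", ])
class2_per_tree <- colSums(inbag[y == "class2", ])

# Do the first two trees use different class2 rows?
differs <- any(inbag[y == "class2", 1] != inbag[y == "class2", 2])
```

Under these assumptions, class1_per_tree and class2_per_tree should both equal 300 for every tree, and the class2 rows should change from tree to tree because a fresh sample is drawn for each one. As noted above, drawing without replacement gives up the bootstrap step.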
Anirudh Kondaveeti
----------------------------
On Fri, Mar 20, 2009 at 1:45 PM, Anirudh Kondaveeti <anirudh.kondaveeti at gmail.com> wrote:
sampsize uses the same sample for all the trees in the random forest. But I want to use a different sample for each of the 500 trees in the random forest. Thanks!
Anirudh Kondaveeti
----------------------------
2009/3/20 Uwe Ligges <ligges at statistik.tu-dortmund.de>
Anirudh Kondaveeti wrote:
Hi! I am working with random forests in R. Is there a way to sample a fixed number of rows from a data set for use with the different trees in a random forest? To be more clear, my data set contains 1500 rows, and I am growing 500 trees. Is it possible to sample only 500 rows from the data set for each tree in the forest? I mean that each tree of the forest should use a different 500 rows from the data set.
See ?randomForest and the argument sampsize. Uwe Ligges
Thanks in advance!
Anirudh Kondaveeti
----------------------------
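The sampsize argument mentioned in the reply above can be checked directly. The following is a minimal sketch, assuming a made-up data set of 1500 rows with a two-level response; the data and variable names are hypothetical. It uses replace = FALSE so that each tree receives exactly 500 distinct rows, and keep.inbag = TRUE to record which rows those were.

```r
library(randomForest)

set.seed(42)

# Hypothetical data: 1500 rows, 4 numeric predictors, two-level response.
n <- 1500
x <- matrix(rnorm(n * 4), ncol = 4)
y <- factor(ifelse(x[, 1] + rnorm(n) > 0, "a", "b"))

rf <- randomForest(x, y,
                   ntree      = 500,
                   sampsize   = 500,    # rows drawn for each tree
                   replace    = FALSE,  # 500 distinct rows per tree
                   keep.inbag = TRUE)   # record the rows used by each tree

# Number of distinct rows each tree was grown on.
rows_per_tree <- colSums(rf$inbag > 0)

# Count how many trees share exactly the same sample as tree 1.
same_as_first <- sum(apply(rf$inbag > 0, 2,
                           function(col) all(col == (rf$inbag[, 1] > 0))))
```

If the sketch behaves as intended, rows_per_tree is 500 throughout and same_as_first is 1 (tree 1 itself), which suggests that sampsize draws a new sample for every tree rather than reusing one.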
R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help