Hi Pierre,
Thanks a lot for your help.
So, using that script, I just separate my data into two parts,
right? To use 70% of the data as the training set and the rest as
the test set, should I multiply n by 0.70 (in this case)?
Many thanks,
Chrysanthi
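For reference, a minimal base-R sketch of the 70/30 split being asked about, using sample() to draw row indices without replacement; the data frame `data` here is a hypothetical stand-in for the actual data set:

```r
# Hypothetical data frame standing in for the real data set
data <- data.frame(x = rnorm(100), y = rnorm(100))

n <- nrow(data)
train_size <- round(0.70 * n)            # 70% of the rows for training

# Draw train_size row indices at random, without replacement
training_indices <- sample(1:n, train_size)
test_indices <- setdiff(1:n, training_indices)

training_set <- data[training_indices, ]
test_set <- data[test_indices, ]
```

So rather than multiplying n itself by 0.70, you multiply it to get the *size* of the random sample of row indices.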
2009/4/12 Pierre Moffard <pier.moff at yahoo.fr>
Hi Chrysanthi,
check out the randomForest package; it has a function with
a CV option. Sorry for not providing a lengthier answer at the
moment, but I'm rather busy on a project. Let me know if you need more help.
Also, to split your data into two parts, the training and test sets, here is what you
can do (n is the number of data points):
n <- length(data[, 1])  # number of rows
# one TRUE/FALSE per row: TRUE -> training, FALSE -> test (about half each)
indices <- sample(rep(c(TRUE, FALSE), length.out = n))
training_indices <- (1:n)[indices]
test_indices <- (1:n)[!indices]
Then, data[training_indices, ] is the training set and data[test_indices, ] is the test set.
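For the 10-fold cross-validation scheme mentioned in the original question, here is a minimal base-R sketch (not part of Pierre's original reply) of assigning rows to folds; n and k are illustrative values:

```r
n <- 100                                 # number of samples (illustrative)
k <- 10                                  # number of folds

# Assign each row to one of k folds, in random order;
# length.out recycles 1:k so fold sizes differ by at most one
folds <- sample(rep(1:k, length.out = n))

# For fold i, rows with folds == i form the test set and the
# remaining rows form the training set:
for (i in 1:k) {
  test_idx  <- which(folds == i)
  train_idx <- which(folds != i)
  # fit the model on data[train_idx, ], evaluate on data[test_idx, ]
}
```

Each row is used for testing exactly once across the k iterations, which is what distinguishes k-fold CV from repeated random splits.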
Best,
Pierre
------------------------------
*From:* Chrysanthi A. <chrysain at gmail.com>
*To:* r-help at r-project.org
*Sent:* Sunday, 12 April 2009, 17:26:59
*Subject:* [R] Running random forest using different training
schemes
Hi,
I would like to run the random forest classification algorithm
and assess the accuracy of the prediction according to different training
schemes. For example, extracting 70% of the samples for training and the
rest for testing, or using a 10-fold cross-validation scheme.
How can I do that? Is there a function?
Thanks a lot,
Chrysanthi.