split data into training and testing sets

3 messages · Dhiren DSouza, Sundar Dorai-Raj, Brian Ripley

#
How can I split a dataset randomly into a training set and a testing set? I would
like to be able to specify the size of the training set and use the
remaining data as the testing set.

For example, a split of 90% training data and 10% testing data. Is there a
function that will accomplish this?

Thank you,

-Dhiren

Rutgers University
Graduate Student
#
Dhiren DSouza wrote:
See ?sample.

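# draw 90% of the row indices of x at random, without replacement
sub <- sample(nrow(x), floor(nrow(x) * 0.9))
training <- x[sub, ]   # the sampled rows
testing <- x[-sub, ]   # all remaining rows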

HTH,

--sundar
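
The same idea can be wrapped in a small reusable function. This is only a sketch: the name split_train_test and its arguments train_frac and seed are hypothetical and do not come from any package. It uses setdiff() for the test rows rather than negative indexing, because x[-sub, ] returns zero rows when sub happens to be empty.

```r
# Hypothetical helper; the name and arguments are illustrative only.
split_train_test <- function(x, train_frac = 0.9, seed = NULL) {
  stopifnot(is.data.frame(x) || is.matrix(x), train_frac > 0, train_frac < 1)
  if (!is.null(seed)) set.seed(seed)  # optional, for a reproducible split
  n <- nrow(x)
  n_train <- floor(n * train_frac)
  idx <- sample(n, n_train)           # row indices drawn without replacement
  # setdiff() stays correct even if idx is empty, unlike x[-idx, ]
  test_idx <- setdiff(seq_len(n), idx)
  list(training = x[idx, , drop = FALSE],
       testing  = x[test_idx, , drop = FALSE])
}

# Usage on the built-in iris data (150 rows)
parts <- split_train_test(iris, train_frac = 0.9, seed = 1)
nrow(parts$training)  # 135
nrow(parts$testing)   # 15
```

Passing a seed makes the split repeatable; leaving it NULL gives a fresh random split on each call.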
#
On Fri, 11 Nov 2005, Dhiren DSouza wrote:

            
Yes, see ?sample: use it to sample indices.
There are lots of examples around, e.g. in ?lda (package MASS).
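
As a rough illustration of sampling indices and using them when fitting a model, here is a sketch in the spirit of that kind of example. It is not copied from the help page; the built-in iris data, the 50/50 split, and the seed are all assumptions chosen for illustration.

```r
library(MASS)  # provides lda(); MASS is a recommended package shipped with R

set.seed(42)                      # arbitrary seed so the split is reproducible
train <- sample(nrow(iris), 75)   # 75 of the 150 row indices for training

# Fit on the training rows only, via the subset argument
fit <- lda(Species ~ ., data = iris, subset = train)

# Predict the class of the held-out rows
pred <- predict(fit, newdata = iris[-train, ])$class

# Proportion of held-out rows classified correctly
mean(pred == iris$Species[-train])
```

Keeping the indices in train, rather than building two separate data frames, means the same vector selects the fitting rows and, negated, the evaluation rows.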