Message-ID: <6731304c0905150726h366f2f20g1f94fa3c93d67e40@mail.gmail.com>
Date: 2009-05-15T14:26:24Z
From: Max Kuhn
Subject: Using sample to create Training and Test sets
In-Reply-To: <39B6DDB9048D0F4DAD42CB26AAFF0AFA0744942C@usctmx1106.merck.com>
>> Forgive the newbie question, I want to select random rows from my
>> data.frame to create a test set (which I can do) but then I want to
>> create a training set using whats left over.
>>
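The caret package has a function, createDataPartition, that does the
split while taking the distribution of the outcome into account, so
the training and test sets end up with similar outcome distributions.
This is useful in classification problems where one or more classes
make up only a small percentage of the data set, since a purely random
split could leave very few of those samples in one of the sets.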
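As a rough sketch of both approaches (the data frame `dat`, its columns
`x` and `y`, and the split sizes are made up for illustration, not taken
from your data): in base R, sample the test rows and use negative
indexing to keep whatever is left over; with caret, if it is installed,
pass the outcome to createDataPartition and index the same way.

```r
# Toy data: a two-class outcome where class "b" is rare (10% of rows).
# The names dat, x and y are hypothetical.
set.seed(1)
dat <- data.frame(x = rnorm(200),
                  y = factor(rep(c("a", "b"), times = c(180, 20))))

# Base R: sample row numbers for the test set, then drop those rows
# with negative indexing to get the training set from what is left.
test_idx  <- sample(nrow(dat), size = 50)
test_set  <- dat[test_idx, ]
train_set <- dat[-test_idx, ]

# caret: split within the levels of the outcome. This part only runs
# if the caret package is available.
if (requireNamespace("caret", quietly = TRUE)) {
  in_train    <- caret::createDataPartition(dat$y, p = 0.75, list = FALSE)
  train_strat <- dat[in_train, ]
  test_strat  <- dat[-in_train, ]
  # Compare class proportions in the two pieces.
  print(prop.table(table(train_strat$y)))
  print(prop.table(table(test_strat$y)))
}
```

The negative index is the key step for the "what's left over" part: the
same vector of row numbers defines both sets, so no row can land in both.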
There is more detail in the caretMisc vignette:
http://cran.r-project.org/web/packages/caret/vignettes/caretMisc.pdf
and there are examples in the caretTrain vignette:
http://cran.r-project.org/web/packages/caret/vignettes/caretTrain.pdf
Max