
Problem to generate training data set and test data set

Aimin Yan wrote:
Hi Aimin,
I haven't tested this exhaustively, but I think it does what you want.

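get.prob.sample<-function(x,prob=0.5) {
  # returns a logical vector, TRUE for the elements chosen for the sample
  xlevels<-levels(as.factor(x))
  xsamp<-rep(FALSE,length(x))
  for(i in xlevels) {
   # positions of this level (which() drops NAs)
   idx<-which(x == i)
   # number to draw from this level, rounded down
   nsamp<-floor(length(idx)*prob)
   # sample positions rather than values, so a level with one element is safe
   xsamp[idx[sample.int(length(idx),nsamp)]]<-TRUE
  }
  return(xsamp)
}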

get.prob.sample(mydata$aa,0.75)
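
The logical vector that comes back can be used to index the rows of the data frame, which gives the training and test sets directly. Here is a rough sketch of that step. The data frame below is made up for illustration (only the column name aa comes from your example), in.train, train and test are just names I picked, and the function is restated compactly so the snippet runs on its own:

```r
# Hypothetical data: 'aa' is the grouping column, 'value' is a placeholder.
set.seed(1)
mydata <- data.frame(aa = rep(c("A", "B", "C"), times = c(8, 4, 1)),
                     value = seq_len(13))

# Compact restatement of get.prob.sample: flag a fraction 'prob' of the
# elements within each level of x, sampling positions via sample.int.
get.prob.sample <- function(x, prob = 0.5) {
  xsamp <- rep(FALSE, length(x))
  for (i in levels(as.factor(x))) {
    idx <- which(x == i)
    xsamp[idx[sample.int(length(idx), floor(length(idx) * prob))]] <- TRUE
  }
  xsamp
}

in.train <- get.prob.sample(mydata$aa, 0.75)
train <- mydata[in.train, ]    # rows flagged TRUE
test  <- mydata[!in.train, ]   # all remaining rows

# Check how each level was split between the two sets.
table(train$aa)
table(test$aa)
```

Because the count per level is rounded down, a level with only one row (like "C" here) always lands in the test set when prob is below 1. If that is not what you want, the rounding is the place to change it.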

Jim