
Random selection from a subsample

On Dec 19, 2010, at 5:31 AM, Tom Wilding wrote:

            
Only 2? Your argument to sample is 10.
And those row numbers would not refer to the order in the original  
sample either, but to positions within the subset. You have not yet  
done a very good job of specifying what sampling strategy is needed.  
At the moment you seem to be working toward a strategy that would  
potentially be very uneven in terms of the probabilities that members  
of different combinations would get into the sample, since the number  
being chosen is fixed and the number to be chosen from "varies  
widely". Is that really what you want?
(You also have not provided a reproducible data example. Next time  
bring data.)
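
To make the concern about uneven probabilities concrete, here is a small sketch. The group sizes are hypothetical (no data were posted); with a fixed number drawn per group, each row's chance of selection is simply that number divided by the group size:

```r
# Hypothetical group sizes, assumed only for illustration (not from the poster's data)
group_sizes <- c(a = 5, b = 50, c = 500)
n_chosen <- 3   # fixed number sampled from every group

# Chance that any given row of a group ends up in the sample
inclusion_prob <- n_chosen / group_sizes
inclusion_prob
#     a     b     c
# 0.600 0.060 0.006
```

A row in the smallest group is 100 times more likely to be picked than a row in the largest one, which may or may not be what is wanted.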

This works to sample 3 from each of the distinct categories in  
the warpbreaks data object:

# returns a list with 6 members, each of which is a three-row data frame
by(warpbreaks, list(warpbreaks$wool, warpbreaks$tension),
   FUN=function(x) x[sample(1:nrow(x), 3), ] )

And this would stick them back together into one data frame:

  do.call(rbind, by(warpbreaks, list(warpbreaks$wool, warpbreaks$tension),
                    FUN=function(x) x[sample(1:nrow(x), 3), ] ) )
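
If some combinations might have fewer rows than the number requested, sample() would throw an error. The following is a minimal sketch of one way around that, assuming it is acceptable to take every row from such small groups. The helper name sample_per_group is made up for this illustration and is not part of base R:

```r
# Hypothetical helper: sample up to n rows from each group of a data frame
sample_per_group <- function(df, groups, n) {
  pieces <- by(df, groups, FUN = function(x) {
    k <- min(n, nrow(x))   # never ask for more rows than the group has
    # seq_len() avoids the 1:0 trap; drop = FALSE keeps a data frame
    x[sample(seq_len(nrow(x)), k), , drop = FALSE]
  })
  do.call(rbind, pieces)   # stack the per-group samples into one data frame
}

set.seed(1)   # arbitrary seed, only so the draw can be repeated
res <- sample_per_group(warpbreaks,
                        list(warpbreaks$wool, warpbreaks$tension), 3)
table(res$wool, res$tension)   # every wool-by-tension cell holds 3 rows
```

With warpbreaks every combination has 9 rows, so the result has 18 rows, 3 per cell.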