how to randomly select the samples with different probabilities for different classes?

3 messages · Marna Wagley, Rui Barradas, Jim Lemon

#
Hi R users,
I have samples with covariates for different classes, and I want to select
samples from the different groups with different probabilities. For example,
I have 22 samples in 3 classes:
groupA has 8 samples
groupB has 8 samples
groupC has 6 samples

I want to select a total of 14 samples from the 22, such that 40% of the
14 samples come from groups A and B combined and 60% of the 14 samples
come from group C.

Could you help me with how to select the samples under these
conditions? I have included sample data below.

dat<-structure(list(sampleID = c(17L, 21L, 36L, 45L, 67L, 82L, 90L,
31L, 70L, 45L, 24L, 80L, 82L, 45L, 85L, 14L, 81L, 96L, 61L, 12L,
65L, 88L), group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A",
"B", "C"), class = "factor")), .Names = c("sampleID", "group"
), class = "data.frame", row.names = c(NA, -22L))

thanks,
  MW
#
Hello,

If 60% of the 14 samples come from group C, then 8.4 samples should come 
from a group with 6 elements. Do you want sampling with replacement? If 
so maybe the following will do.


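perc <- c(0.4, 0.6)  # shares for groups A and B pooled, then group C
# Row numbers split into non-C ("FALSE") and C ("TRUE")
tmp <- split(seq_len(nrow(dat)), dat$group == "C")
# Draw round(perc * 14) positions within each part, with replacement
idx <- lapply(seq_along(tmp), function(i)
  sample(length(tmp[[i]]), round(perc[i] * 14), replace = TRUE))
# Group C occupies rows 17 to 22, so shift its positions past the 16 A and B rows
idx[[2]] <- idx[[2]] + 16
idx <- unlist(idx)
dat[idx, ]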
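
The approach above can also be wrapped in a small helper that picks row numbers directly, so it does not rely on the rows being sorted by group. This is only a sketch: `draw_by_share` is a hypothetical name, the seed is arbitrary, and the data frame is rebuilt from the values in the original post so the block runs on its own.

```r
# Rebuild the example data from the original post
dat <- data.frame(
  sampleID = c(17, 21, 36, 45, 67, 82, 90, 31, 70, 45, 24,
               80, 82, 45, 85, 14, 81, 96, 61, 12, 65, 88),
  group = factor(rep(c("A", "B", "C"), times = c(8, 8, 6)))
)

# Hypothetical helper: draw round(share * n) rows from the target stratum
# and round((1 - share) * n) rows from the rest, with replacement
draw_by_share <- function(data, is_target, share, n) {
  strata <- split(seq_len(nrow(data)), is_target)  # "FALSE" first, then "TRUE"
  sizes  <- round(c(1 - share, share) * n)
  idx <- unlist(Map(
    function(rows, k) rows[sample.int(length(rows), k, replace = TRUE)],
    strata, sizes
  ))
  data[idx, ]
}

set.seed(1)  # arbitrary seed, only for reproducibility
picked <- draw_by_share(dat, dat$group == "C", share = 0.6, n = 14)
table(picked$group == "C")  # 6 rows from A and B, 8 rows from C
```

Indexing with `rows[sample.int(length(rows), ...)]` rather than `sample(rows, ...)` avoids the surprise that `sample()` treats a single number as `1:x`.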

Hope this helps,

Rui Barradas

On 07-12-2016 11:58, Marna Wagley wrote:
#
Hi Marna,
If we assume each row is one sample, something like this:

dat[sample(which(dat$group != "C"), ceiling(14 * 0.4), replace = TRUE), ]
dat[sample(which(dat$group == "C"), floor(14 * 0.6), replace = TRUE), ]

Then just step through the two subsets to access your samples.

One problem is that you will not get exactly 40% or 60%, which is why
I had to put the "ceiling" and "floor" functions to work. Also, you
will have to sample with replacement, as you would otherwise exhaust
the "C" group.
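
To make both points concrete, here is a small sketch of the arithmetic and of what happens without replacement. The variable names are illustrative only.

```r
n_total <- 14
targets <- c(notC = 0.4, C = 0.6) * n_total  # 5.6 and 8.4, not whole numbers

# Round one count up and the other down so they still sum to 14
counts <- c(notC = ceiling(targets[["notC"]]), C = floor(targets[["C"]]))
counts            # 6 and 8
counts / n_total  # realised shares, roughly 0.43 and 0.57

# Group C has only 6 rows, so asking for 8 without replacement is an error;
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch(sample(6, counts[["C"]]),
                error = function(e) conditionMessage(e))
res
```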

Jim
On Wed, Dec 7, 2016 at 10:58 PM, Marna Wagley <marna.wagley at gmail.com> wrote: