
biasing conditional sample

5 messages · Rui Barradas, arun, David L Carlson +1 more

#
Hi all,

I'm looking for some help to bias the sample function. Basically, I'd like
to generate a data frame where the first column is completely random, the
second, however, is conditional on the first, the third is conditional on
the first and the second, and so on. By conditional I mean that I shouldn't
have repeated values within a row. I know it could be easily implemented
using permutation, but that is not the case here. I need at least five
columns. Any ideas on how to achieve what I need?


set.seed(51)
data <- data.frame(
    id=as.factor(1:100),
    a=as.factor(sample(1:10, size=100, replace=TRUE)),
    b=as.factor(sample(1:10, size=100, replace=TRUE)),
    c=as.factor(sample(1:10, size=100, replace=TRUE)),
    d=as.factor(sample(1:10, size=100, replace=TRUE)),
    e=as.factor(sample(1:10, size=100, replace=TRUE))
)
#
Hello,

The function that follows returns a matrix, not a data.frame, but it does
what you ask for.


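fun <- function(x, y, n){
     # f: draw one value from x, redrawing until it is not already in row y
     # (loops forever if the row already contains every value of x)
     f <- function(x, y){
         while(TRUE){
             rnd <- sample(x, 1)
             if(!any(rnd %in% y)) break
         }
         rnd
     }
     # append n new columns, one at a time, to the matrix y
     for(i in seq_len(n)){
         tmp <- apply(y, 1, function(.y) f(x, .y))
         y <- cbind(y, tmp)
     }
     y
}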


a <- cbind(sample(1:10, 100, TRUE)) # must have dims
fun(1:10, a, 4)  # returns 5 columns, 'a' plus 4


Hope this helps,

Rui Barradas
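
[Editor's note] The redraw loop above can also be written without retries by sampling directly from the values a row has not used yet. The following is only a sketch of that idea, not code from the thread: the names draw_unused and add_columns are hypothetical, and it assumes x holds distinct values. Each new value is still drawn uniformly from the unused ones, so the distribution should match the loop version.

```r
# Hypothetical helper (not from the thread): pick one value of x not in 'used'.
draw_unused <- function(x, used) {
  candidates <- setdiff(x, used)
  if (length(candidates) == 0L) stop("no unused values left in 'x'")
  # Sample an index rather than the values themselves: sample(k, 1) with a
  # single number k would draw from 1:k instead of returning k.
  candidates[sample.int(length(candidates), 1L)]
}

# Hypothetical wrapper: append n columns to the matrix y, row by row.
add_columns <- function(x, y, n) {
  for (i in seq_len(n)) {
    new_col <- apply(y, 1, function(row) draw_unused(x, row))
    y <- cbind(y, new_col)
  }
  unname(y)
}

set.seed(51)
first <- cbind(sample(1:10, 100, TRUE))  # first column, must have dims
res <- add_columns(1:10, first, 4)       # 100 x 5 matrix

# Convert to a data frame, as asked for in the original question
dat <- as.data.frame(res)
names(dat) <- letters[1:5]
```

Unlike the loop version, this stops with an error instead of running forever when a row has already used every value in x.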
On 11-11-2012 19:06, dms at riseup.net wrote:
#
Hi,

If the goal is to drop the rows of the example "data" that contain repeated values, then

dat2<-data[apply(data,1,function(x) all(!duplicated(x)|duplicated(x,fromLast=TRUE))),]
head(dat2)
#   id a b  c d  e
#6   6 9 5 10 1  7
#8   8 5 2  6 7  4
#11 11 6 4  9 8  5
#12 12 7 1  8 9 10
#15 15 1 9  8 4  7
#16 16 6 1  3 7 10

A.K.
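
[Editor's note] One thing to watch in the filter above: apply(data, 1, ...) passes every column of each row, including id, so a row whose id happens to equal one of its a..e values (possible for ids 1 to 10) would also be dropped. A possible variant, sketched here under the assumption that only columns a to e should be compared, restricts the check to those columns and uses anyDuplicated(). The names value_cols and keep are made up for this illustration.

```r
set.seed(51)
data <- data.frame(
  id = as.factor(1:100),
  a = as.factor(sample(1:10, size = 100, replace = TRUE)),
  b = as.factor(sample(1:10, size = 100, replace = TRUE)),
  c = as.factor(sample(1:10, size = 100, replace = TRUE)),
  d = as.factor(sample(1:10, size = 100, replace = TRUE)),
  e = as.factor(sample(1:10, size = 100, replace = TRUE))
)

# Assumption: only these columns take part in the "no repeats" rule
value_cols <- c("a", "b", "c", "d", "e")

# anyDuplicated() returns 0 when a row has no repeated value
keep <- apply(data[value_cols], 1, function(x) anyDuplicated(x) == 0L)
dat2 <- data[keep, ]
```

Because rows are only filtered out, the result will usually have far fewer than 100 rows; it does not generate replacement values.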




______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Can't you just use sample() on each row without replacement to guarantee no
matches among the five (or more) columns?

set.seed(51)
Data <- sapply(1:100, function(x) sample(1:10, size=5))
Data <- data.frame(t(Data))
names(Data) <- letters[1:5]

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
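
[Editor's note] The same row-wise sampling without replacement can be written with replicate(), followed by a quick check that no row repeats a value. This is just an equivalent sketch; n_rows, n_cols and no_repeats are illustrative names, not part of the original post.

```r
set.seed(51)
n_rows <- 100
n_cols <- 5

# replicate() returns an n_cols x n_rows matrix, so transpose it
Data <- as.data.frame(t(replicate(n_rows, sample(1:10, size = n_cols))))
names(Data) <- letters[seq_len(n_cols)]

# Check: every row should consist of distinct values
no_repeats <- apply(Data, 1, function(x) anyDuplicated(x) == 0L)
stopifnot(all(no_repeats))
```

Since sample() draws without replacement by default, n_cols cannot exceed the number of available values (10 here).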
#
Thanks for the solutions. Carlson's and Barradas's approaches give me what
I need. Nonetheless, Carlson's proposal is slightly better for my purposes
because it's shorter.

Thanks
Daniel