Hi all,
I'm looking for some help biasing the sample() function. Basically, I'd like
to generate a data frame where the first column is completely random, the
second, however, is conditional on the first, the third is conditional on
the first and the second, and so on. By conditional I mean that there should
be no repeated values within a row. I know it could easily be implemented
using permutation, but that is not the case here. I need at least five
columns. Any ideas on how to achieve what I need?
set.seed(51)
data <- data.frame(
  id=as.factor(1:100),
  a=as.factor(sample(1:10, size=100, replace=TRUE)),
  b=as.factor(sample(1:10, size=100, replace=TRUE)),
  c=as.factor(sample(1:10, size=100, replace=TRUE)),
  d=as.factor(sample(1:10, size=100, replace=TRUE)),
  e=as.factor(sample(1:10, size=100, replace=TRUE))
)
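As a rough check of why the example above does not yet satisfy the constraint: with five independent draws from 1:10, the chance that a row has no repeated value is 10*9*8*7*6/10^5 = 0.3024, so about 70% of rows are expected to contain a repeat. A minimal sketch follows; the names `p_distinct`, `vals` and `has_repeat` are illustrative and not part of the original post.

```r
set.seed(51)
data <- data.frame(
  id = as.factor(1:100),
  a = as.factor(sample(1:10, size = 100, replace = TRUE)),
  b = as.factor(sample(1:10, size = 100, replace = TRUE)),
  c = as.factor(sample(1:10, size = 100, replace = TRUE)),
  d = as.factor(sample(1:10, size = 100, replace = TRUE)),
  e = as.factor(sample(1:10, size = 100, replace = TRUE))
)

# Probability that 5 independent draws from 10 values are all distinct
p_distinct <- prod(10:6) / 10^5

# Flag rows in which a value repeats across columns a..e (id is excluded)
vals <- as.matrix(data[, c("a", "b", "c", "d", "e")])
has_repeat <- apply(vals, 1, function(r) anyDuplicated(r) > 0)

mean(has_repeat)  # observed share of rows that contain a repeat
```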
biasing conditional sample
5 messages · Rui Barradas, arun, David L Carlson +1 more
Hello,
The function that follows returns a matrix, not a data.frame but does
what you ask for.
fun <- function(x, y, n){
  f <- function(x, y){
    while(TRUE){
      rnd <- sample(x, 1)
      if(!any(rnd %in% y)) break
    }
    rnd
  }
  for(i in seq_len(n)){
    tmp <- apply(y, 1, function(.y) f(x, .y))
    y <- cbind(y, tmp)
  }
  y
}
a <- cbind(sample(1:10, 100, TRUE)) # must have dims
fun(1:10, a, 4) # returns 5 columns, 'a' plus 4
Hope this helps,
Rui Barradas
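One caveat with the rejection loop above: it never terminates once a row already holds every value in x, which happens if more columns are requested than there are distinct values. A possible alternative, sketched here as an assumption rather than as part of the posted answer, draws directly from the values not yet used in the row. The names `fun2` and `draw_unused` are hypothetical, and the sketch assumes the values in x are distinct. Indexing with sample.int() avoids the sample() behaviour in which a single number x is treated as 1:x.

```r
fun2 <- function(x, y, n) {
  # Fail early instead of looping forever when distinct values would run out
  stopifnot(is.matrix(y), ncol(y) + n <= length(unique(x)))

  draw_unused <- function(x, used) {
    pool <- setdiff(x, used)           # values not yet present in this row
    pool[sample.int(length(pool), 1)]  # safe even when only one value remains
  }

  for (i in seq_len(n)) {
    new_col <- apply(y, 1, function(row) draw_unused(x, row))
    y <- cbind(y, new_col)
  }
  unname(y)
}

set.seed(51)
a <- cbind(sample(1:10, 100, TRUE))  # first column: plain random draws
m <- fun2(1:10, a, 4)                # 100 x 5 matrix, no repeats within a row
```

Each new column costs one setdiff() per row instead of an unknown number of retries, which matters mostly when the number of columns approaches length(x).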
On 11-11-2012 19:06, dms at riseup.net wrote:
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi,
If the question is how to remove the rows with duplicated/repeated values from the example "data", then
dat2 <- data[apply(data, 1, function(x) all(!duplicated(x) | duplicated(x, fromLast=TRUE))), ]
head(dat2)
#   id a b  c d  e
#6   6 9 5 10 1  7
#8   8 5 2  6 7  4
#11 11 6 4  9 8  5
#12 12 7 1  8 9 10
#15 15 1 9  8 4  7
#16 16 6 1  3 7 10
A.K.
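Note that apply() in the code above runs over every column, including id, so a row whose id happens to equal one of its sampled values would also be dropped. A sketch of a variant that tests only the sampled columns, using anyDuplicated(); the column selection `cols` and the name `keep` are assumptions made for illustration. Like the code above, it filters existing rows rather than generating new ones, so fewer than 100 rows remain.

```r
set.seed(51)
data <- data.frame(
  id = as.factor(1:100),
  a = as.factor(sample(1:10, size = 100, replace = TRUE)),
  b = as.factor(sample(1:10, size = 100, replace = TRUE)),
  c = as.factor(sample(1:10, size = 100, replace = TRUE)),
  d = as.factor(sample(1:10, size = 100, replace = TRUE)),
  e = as.factor(sample(1:10, size = 100, replace = TRUE))
)

cols <- c("a", "b", "c", "d", "e")  # columns that must not repeat within a row

# anyDuplicated() returns 0 when all entries of the row are distinct
keep <- apply(as.matrix(data[, cols]), 1, function(x) anyDuplicated(x) == 0)
dat2 <- data[keep, ]
```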
Can't you just use sample() on each row without replacement to guarantee no matches among the five (or more) columns?
set.seed(51)
Data <- sapply(1:100, function(x) sample(1:10, size=5))
Data <- data.frame(t(Data))
names(Data) <- letters[1:5]
----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
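Drawing k values without replacement is equivalent to drawing the first value uniformly and each later value uniformly from those not yet used in the row, so this approach matches the sequential, conditional scheme asked for. A sketch that wraps it in a function with basic argument checks; `sample_rows`, `values`, `n` and `k` are illustrative names, and the sketch assumes the entries of `values` are distinct.

```r
sample_rows <- function(values, n, k) {
  # Cannot draw more distinct values than exist; letters limits the column names
  stopifnot(n >= 1, k >= 1, k <= length(values), k <= length(letters))

  # One draw of k distinct values per row, indexed to avoid the sample(x) scalar case
  rows <- lapply(seq_len(n), function(i) values[sample.int(length(values), k)])

  m <- matrix(unlist(rows), nrow = n, ncol = k, byrow = TRUE)
  out <- as.data.frame(m)
  names(out) <- letters[seq_len(k)]
  out
}

set.seed(51)
Data <- sample_rows(1:10, n = 100, k = 5)
```

Building the matrix row by row with byrow = TRUE keeps the result rectangular even when k is 1, where sapply() followed by t() would return a single row instead of a single column.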
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Rui Barradas
Sent: Sunday, November 11, 2012 4:36 PM
To: dms at riseup.net
Cc: r-help at r-project.org
Subject: Re: [R] biasing conditional sample
Thanks for the solutions. Carlson's and Barradas's approaches give me what I need. Nonetheless, Carlson's proposal is slightly better for my purposes because it's shorter.
Thanks,
Daniel