select a subset from a sample
3 messages · Wei Yang, Ista Zahn, Den
I think there are multiple solutions that match your criteria. Here is one:
dat <- structure(list(Id = 1:20, v1 = c(1L, 2L, 4L, 1L, 3L, 3L, 3L,
4L, 1L, 4L, 2L, 1L, 2L, 4L, 3L, 2L, 1L, 2L, 4L, 3L), v2 = c(2L,
1L, 2L, 1L, 2L, 1L, 4L, 4L, 2L, 1L, 4L, 4L, 3L, 3L, 2L, 3L, 4L,
3L, 1L, 3L), v3 = c(4L, 3L, 4L, 2L, 3L, 1L, 3L, 4L, 2L, 1L, 3L,
2L, 3L, 1L, 1L, 2L, 1L, 4L, 4L, 2L), v4 = c(3L, 4L, 2L, 3L, 4L,
1L, 1L, 4L, 1L, 2L, NA, 3L, 4L, NA, 2L, 3L, 4L, 3L, 1L, 1L)),
.Names = c("Id", "v1", "v2", "v3", "v4"), class = "data.frame",
row.names = c(NA, -20L))
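# For each row, count the columns in which its value occurs for the first time
keep <- rowSums(apply(dat[, -1], 2, function(x) !duplicated(x)))
# Keep every row that introduces at least one new level
dat.sub <- dat[keep > 0, ]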
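The first-occurrence approach above yields a subset that contains every level, but not necessarily the smallest one. Finding the smallest such subset is a set-cover problem, and a common heuristic is a greedy search. The following is only a sketch under stated assumptions: `greedy_cover` is a hypothetical helper name (not part of base R or any package), the data are retyped from the table in the original post (where v4 is 2 for Id 11 and 14, rather than the NA seen in the `structure()` call), and the greedy result is not guaranteed to be minimal either.

```r
# Data retyped from the table in the original post (assumption: no NAs).
dat <- data.frame(
  Id = 1:20,
  v1 = c(1, 2, 4, 1, 3, 3, 3, 4, 1, 4, 2, 1, 2, 4, 3, 2, 1, 2, 4, 3),
  v2 = c(2, 1, 2, 1, 2, 1, 4, 4, 2, 1, 4, 4, 3, 3, 2, 3, 4, 3, 1, 3),
  v3 = c(4, 3, 4, 2, 3, 1, 3, 4, 2, 1, 3, 2, 3, 1, 1, 2, 1, 4, 4, 2),
  v4 = c(3, 4, 2, 3, 4, 1, 1, 4, 1, 2, 2, 3, 4, 2, 2, 3, 4, 3, 1, 1)
)

# Hypothetical helper: greedy set cover over (variable, level) pairs.
# Returns the indices of the selected rows.
greedy_cover <- function(d, vars) {
  # Every "variable=level" pair that must appear; NA is not counted as a level.
  needed <- unlist(lapply(vars, function(v) {
    paste(v, sort(unique(na.omit(d[[v]]))), sep = "=")
  }))
  # The pairs that each row contributes.
  row_pairs <- lapply(seq_len(nrow(d)), function(i) {
    vals <- unlist(d[i, vars])
    paste(vars, vals, sep = "=")[!is.na(vals)]
  })
  chosen <- integer(0)
  while (length(needed) > 0) {
    # Number of still-missing pairs each row would add.
    gain <- vapply(row_pairs, function(p) sum(p %in% needed), integer(1))
    best <- which.max(gain)  # ties go to the earliest row
    chosen <- c(chosen, best)
    needed <- setdiff(needed, row_pairs[[best]])
  }
  sort(chosen)
}

vars <- c("v1", "v2", "v3", "v4")
idx <- greedy_cover(dat, vars)
dat.min <- dat[idx, ]
dat.min
```

Since each row supplies only one level per variable and every variable has four levels, no valid subset can have fewer than four rows; comparing `nrow(dat.min)` with that bound gives a rough idea of how close the heuristic gets.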
Best, Ista
On Sun, Jan 23, 2011 at 12:43 PM, Wei Yang <peterwyang1 at gmail.com> wrote:
Dear all,
I would like to ask whether anyone has experience with the problem below.
I want to select a subset of the sample (see data below) so that each level
(1,2,3,4 in the example) for every variable (v1,v2,v3,v4 in the example) is
shown at least once in the subset. I also want the sample size of the
subset to be as small as possible. Any help on it is greatly appreciated.
      Id v1 v2 v3 v4
 [1,]  1  1  2  4  3
 [2,]  2  2  1  3  4
 [3,]  3  4  2  4  2
 [4,]  4  1  1  2  3
 [5,]  5  3  2  3  4
 [6,]  6  3  1  1  1
 [7,]  7  3  4  3  1
 [8,]  8  4  4  4  4
 [9,]  9  1  2  2  1
[10,] 10  4  1  1  2
[11,] 11  2  4  3  2
[12,] 12  1  4  2  3
[13,] 13  2  3  3  4
[14,] 14  4  3  1  2
[15,] 15  3  2  1  2
[16,] 16  2  3  2  3
[17,] 17  1  4  1  4
[18,] 18  2  3  4  3
[19,] 19  4  1  4  1
[20,] 20  3  3  2  1
Thanks,
Peter
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
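Maybe this:
# Tabulate how often each level occurs in v1..v4
su <- lapply(dat[2:5], function(x) table(x))
su
mode(su)
# Bind the per-variable frequency tables side by side
myBYdata <- data.frame(do.call(cbind, lapply(su, as.data.frame)))
myBYdata
On 23/01/2011 at 07:43 -0500, Wei Yang wrote: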
Dear all,
I would like to ask whether anyone has experience with the problem below.
I want to select a subset of the sample (see data below) so that each level
(1,2,3,4 in the example) for every variable (v1,v2,v3,v4 in the example) is
shown at least once in the subset. I also want the sample size of the
subset to be as small as possible. Any help on it is greatly appreciated.
Id v1 v2 v3 v4
[1,] 1 1 2 4 3
[2,] 2 2 1 3 4
[3,] 3 4 2 4 2
[4,] 4 1 1 2 3
[5,] 5 3 2 3 4
[6,] 6 3 1 1 1
[7,] 7 3 4 3 1
[8,] 8 4 4 4 4
[9,] 9 1 2 2 1
[10,] 10 4 1 1 2
[11,] 11 2 4 3 2
[12,] 12 1 4 2 3
[13,] 13 2 3 3 4
[14,] 14 4 3 1 2
[15,] 15 3 2 1 2
[16,] 16 2 3 2 3
[17,] 17 1 4 1 4
[18,] 18 2 3 4 3
[19,] 19 4 1 4 1
[20,] 20 3 3 2 1
Thanks,
Peter