Dear all, let's say I have the following data:
dat <- structure(list(V1 = structure(c(1L, 4L, 5L, 3L, 3L, 5L, 6L, 6L,
4L, 3L, 5L, 6L, 5L, 5L, 4L, 4L, 6L, 2L, 3L, 4L, 3L, 3L, 2L, 5L,
3L, 6L, 3L, 3L, 6L, 3L, 6L, 1L, 6L, 5L, 2L, 2L), .Label = c("C",
"G", "I", "O", "R", "T"), class = "factor"), V2 = c(0L, 0L, 0L,
1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L,
1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L,
0L), V3 = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L,
0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L,
0L, 1L, 0L, 1L, 0L, 1L, 1L)), .Names = c("V1", "V2", "V3"), class =
"data.frame", row.names = c(NA,
-36L))
Now I want to get the following kind of data frame out of that:
dat1 <- structure(list(V1 = structure(c(3L, 3L, 1L, 1L, 2L, 2L), .Label
= c("C",
"G", "I"), class = "factor"), V2 = c(0L, 1L, 0L, 1L, 0L, 1L),
V3 = c(0.333333333, 0.428571429, 0.5, NA, 1, NA)), .Names = c("V1",
"V2", "V3"), class = "data.frame", row.names = c(NA, -6L))
Basically, the 3rd column of 'dat1' is computed as follows: for 'V1 = I' and
'V2 = 0', what is the proportion of 1s in 'V3', and so on for each combination.
Is there any R function to achieve that directly?
Thanks and regards,
Can somebody help me with following data manipulation?
5 messages · Christofer Bogaso, Sarah Goslee, Thomas Stewart +2 more
If I understand what you want correctly, aggregate() should do it.
aggregate(V3 ~ V1 + V2, "mean", data=dat)
   V1 V2        V3
1   C  0 0.5000000
2   G  0 1.0000000
3   I  0 0.3333333
4   O  0 1.0000000
5   R  0 0.0000000
6   T  0 0.8333333
7   I  1 0.4285714
8   O  1 0.0000000
9   R  1 0.6666667
10  T  1 0.5000000

That returns the combinations that actually exist. If you convert V1 and V2 to factors, thus setting the possible levels, all combinations will be returned:

dat$V1 <- factor(dat$V1)
dat$V2 <- factor(dat$V2)
aggregate(V3 ~ V1 + V2, "mean", data=dat)
   V1 V2        V3
1   C  0 0.5000000
2   G  0 1.0000000
3   I  0 0.3333333
4   O  0 1.0000000
5   R  0 0.0000000
6   T  0 0.8333333
7   I  1 0.4285714
8   O  1 0.0000000
9   R  1 0.6666667
10  T  1 0.5000000

Sarah

On Thu, Dec 6, 2012 at 2:35 PM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
Dear all, let say I have following data:
[...]
Converting to factors does not get all combinations.
v3mean <- aggregate(V3~V1+V2, dat, mean)
cats <- with(dat, expand.grid(V1=unique(V1), V2=unique(V2)))
merge(cats, v3mean, all=TRUE)

   V1 V2        V3
1   C  0 0.5000000
2   C  1        NA
3   G  0 1.0000000
4   G  1        NA
5   I  0 0.3333333
6   I  1 0.4285714
7   O  0 1.0000000
8   O  1 0.0000000
9   R  0 0.0000000
10  R  1 0.6666667
11  T  0 0.8333333
12  T  1 0.5000000

But the OP's dat1 contains only 6 observations.

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
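To end up with only the six rows shown in the OP's dat1, one possibility is to subset before building the grid. This is only a sketch and was not posted in the thread. It assumes the OP wants just the V1 levels C, G and I, which is a guess read off dat1, and the names keep, sub and res are made up for illustration.

```r
# Example data from the original post, rebuilt compactly.
dat <- data.frame(
  V1 = factor(c(1, 4, 5, 3, 3, 5, 6, 6, 4, 3, 5, 6, 5, 5, 4, 4, 6, 2,
                3, 4, 3, 3, 2, 5, 3, 6, 3, 3, 6, 3, 6, 1, 6, 5, 2, 2),
              levels = 1:6, labels = c("C", "G", "I", "O", "R", "T")),
  V2 = c(0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L,
         0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L,
         1L, 0L, 0L, 0L),
  V3 = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L,
         1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L,
         1L, 0L, 1L, 1L)
)

# Assumption: only these three levels of V1 are wanted (as in dat1).
keep <- c("C", "G", "I")
sub  <- droplevels(dat[dat$V1 %in% keep, ])

# Means of V3 for the V1/V2 combinations that occur in the subset.
v3mean <- aggregate(V3 ~ V1 + V2, data = sub, FUN = mean)

# Full V1 x V2 grid, so combinations with no data show up as NA.
cats <- expand.grid(V1 = levels(sub$V1), V2 = sort(unique(sub$V2)))

# Left join: keep every grid row, fill in the means where they exist.
res <- merge(cats, v3mean, all.x = TRUE)
print(res)
```

Because V3 is coded 0/1, its mean within a group is the proportion of 1s, which is what the OP's third column holds.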
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sarah Goslee
Sent: Thursday, December 06, 2012 2:04 PM
To: Christofer Bogaso
Cc: r-help
Subject: Re: [R] Can somebody help me with following data manipulation?

If I understand what you want correctly, aggregate() should do it.
[...]
Hi,
You can also use

library(plyr)
ddply(dat,.(V1,V2),summarise,V3=mean(V3),.drop=FALSE)
#   V1 V2        V3
#1   C  0 0.5000000
#2   C  1       NaN
#3   G  0 1.0000000
#4   G  1       NaN
#5   I  0 0.3333333
#6   I  1 0.4285714
#7   O  0 1.0000000
#8   O  1 0.0000000
#9   R  0 0.0000000
#10  R  1 0.6666667
#11  T  0 0.8333333
#12  T  1 0.5000000

A.K.

----- Original Message -----
From: Thomas Stewart <tgs.public.mail at gmail.com>
To:
Cc: r-help <r-help at r-project.org>
Sent: Thursday, December 6, 2012 3:17 PM
Subject: Re: [R] Can somebody help me with following data manipulation?

You can directly use the tapply function.
-tgs

tapply(dat[,3],dat[,-3],mean)
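Note that the tapply() call above returns a V1 by V2 matrix rather than a data frame. If a long data frame in the shape of dat1 is wanted, one way is to pass the matrix through as.table() and as.data.frame(). This is a sketch only; the names tab and long are made up here.

```r
# Example data from the original post, rebuilt compactly.
dat <- data.frame(
  V1 = factor(c(1, 4, 5, 3, 3, 5, 6, 6, 4, 3, 5, 6, 5, 5, 4, 4, 6, 2,
                3, 4, 3, 3, 2, 5, 3, 6, 3, 3, 6, 3, 6, 1, 6, 5, 2, 2),
              levels = 1:6, labels = c("C", "G", "I", "O", "R", "T")),
  V2 = c(0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L,
         0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L,
         1L, 0L, 0L, 0L),
  V3 = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L,
         1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L,
         1L, 0L, 1L, 1L)
)

# Matrix of group means; cells with no observations are NA.
tab <- with(dat, tapply(V3, list(V1 = V1, V2 = V2), mean))

# Reshape the matrix into one row per V1/V2 combination.
long <- as.data.frame(as.table(tab), responseName = "V3")

# V2 comes back as a factor; convert it to integer to match the input.
long$V2 <- as.integer(as.character(long$V2))
print(long)
```

Empty cells come out as NA here, whereas the ddply() call gives NaN for them, since mean() of an empty vector is NaN.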
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.