How to Un-group a grouped data set?
Don't use subset as a function name: it's already the name of a
rather important function, as is data (but at least you're not using
that one as a function, so it's not quite so bad). Finally, use dput()
when sending data so we get a plain-text, reproducible version.
I'd try something like this:
dats <- structure(list(Study = c(1L, 1L, 2L, 2L, 3L, 3L),
                       TX = c(1L, 0L, 1L, 0L, 1L, 0L),
                       AEs = c(3L, 2L, 1L, 2L, 1L, 1L),
                       N = c(5L, 7L, 10L, 7L, 8L, 4L)),
                  .Names = c("Study", "TX", "AEs", "N"),
                  class = "data.frame",
                  row.names = c("1", "2", "3", "4", "5", "6"))
# See how handy dput can be :-)
dats[unlist(mapply(FUN = function(x,y) rep(x, y), 1:NROW(dats), dats$N)), -4]
which isn't super elegant, but others might have something better.
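One caveat worth flagging: the one-liner above repeats each row N times, but it leaves AEs as the arm-level count rather than a 0/1 value per subject. Here is a minimal sketch of one way to get the per-subject indicator, assuming the positives are listed first within each arm as in the desired output. The names idx and long are just illustrative, and this is only one possible approach, not necessarily the best one.

```r
# Same data as in the dput() output, rebuilt here so the sketch is self-contained.
dats <- data.frame(Study = c(1L, 1L, 2L, 2L, 3L, 3L),
                   TX    = c(1L, 0L, 1L, 0L, 1L, 0L),
                   AEs   = c(3L, 2L, 1L, 2L, 1L, 1L),
                   N     = c(5L, 7L, 10L, 7L, 8L, 4L))

# Repeat each row index N times: one output row per subject.
idx <- rep(seq_len(nrow(dats)), times = dats$N)
long <- dats[idx, c("Study", "TX")]

# Within each arm: AEs ones followed by (N - AEs) zeros.
long$AEs <- unlist(mapply(function(a, n) rep(c(1L, 0L), times = c(a, n - a)),
                          dats$AEs, dats$N, SIMPLIFY = FALSE))

# Drop the duplicated row names left over from the indexing step.
rownames(long) <- NULL
```

The result should have sum(dats$N) rows, and sum(long$AEs) should equal sum(dats$AEs), which makes for a quick sanity check.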
Best,
Michael
On Tue, May 15, 2012 at 1:24 AM, Cheenghee AM Koh <sigontw at gmail.com> wrote:
Hello, R-fellows, I have a question that I really don't know how to solve. I have spent hours searching online for possible solutions, but in vain. If anyone could help me handle this issue, I would really appreciate it! I have a "grouped" dataset like this:
data
  Study TX AEs  N
1     1  1   3  5
2     1  0   2  7
3     2  1   1 10
4     2  0   2  7
5     3  1   1  8
6     3  0   1  4
where Study is the study id, TX is treatment, AEs is how many people in
that arm are positive, and N is the number of subjects. Therefore, row 1
stands for the treatment arm of study 1, where there are 5 subjects and
3 of them are positive. Row 2 stands for the control arm of study 1,
where there are 7 subjects and 2 of them are positive.
Now I would like to "un-group them", make it like:
Study TX AEs
    1  1   1
    1  1   1
    1  1   1
    1  1   0
    1  1   0
    1  0   1
    1  0   1
    1  0   0
    1  0   0
    1  0   0
    1  0   0
    1  0   0
    2  1   1
.....................
But I wasn't able to do it. In fact, I wrote a small function and used
"lapply" to get what I want. It worked well and did give me what I want.
But I wasn't able to collapse all the returns into one single data frame
for subsequent analysis.
The function I wrote:
subset = function(i){
  d = c(rep(data[i,1], data[i,4]), rep(data[i,2], data[i,4]),
        rep(0:1, c(data[i,4] - data[i,3], data[i,3])))
  d = matrix(d, data[i,4], 3)
  d
}
then:
Data = lapply(1:6, subset)
Data
Therefore, I tried to write a loop. But no matter how I tried, I couldn't
get what I want.
Any idea?
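For the specific problem of collapsing a list of per-row pieces into one data frame, do.call(rbind, ...) is a common idiom. A minimal sketch follows, assuming the data frame is called dats and the helper is called expand_arm. Both names are hypothetical and chosen only to avoid masking the built-in data() and subset(). Here the positives are listed first within each arm, as in the desired output shown earlier.

```r
# Hypothetical names: `dats` for the grouped data, `expand_arm` for the helper.
dats <- data.frame(Study = c(1L, 1L, 2L, 2L, 3L, 3L),
                   TX    = c(1L, 0L, 1L, 0L, 1L, 0L),
                   AEs   = c(3L, 2L, 1L, 2L, 1L, 1L),
                   N     = c(5L, 7L, 10L, 7L, 8L, 4L))

# Expand row i into N[i] subject-level rows, returned as a data frame.
expand_arm <- function(i) {
  n   <- dats$N[i]
  pos <- dats$AEs[i]
  data.frame(Study = rep(dats$Study[i], n),
             TX    = rep(dats$TX[i], n),
             AEs   = rep(c(1L, 0L), times = c(pos, n - pos)))
}

# lapply() returns a list of data frames; do.call(rbind, ...) stacks them.
pieces    <- lapply(seq_len(nrow(dats)), expand_arm)
ungrouped <- do.call(rbind, pieces)
```

Returning data frames instead of matrices from the helper keeps the column names, so the stacked result can go straight into subsequent analysis.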
Thank you so much!
Best,
--
Cheenghee Masaki Koh, MSW, MS(c), PhD Student
School of Social Service Administration
Department of Health Studies, Division of Biological Science
University of Chicago
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.