Wide to long form conversion
On Oct 7, 2011, at 1:30 PM, David Winsemius wrote:
On Oct 7, 2011, at 7:40 AM, Gang Chen wrote:
Jim, I really appreciate your help! I like the power of rep_n_stack, but how can I use rep_n_stack to get the following result?
  Subj Group value Ref Var Time
1   S1     s     4  Me   F    1
2   S1     s     3  Me   F    2
3   S1     s     5  Me   J    1
4   S1     s     6  Me   J    2
5   S1     s     6 She   F    1
6   S1     s     6 She   F    2
7   S1     s    10 She   J    1
8   S1     s     9 She   J    2
I was not able to construct a one-step solution with `reshape` that
contains all the columns. You can do it in about 4 steps by
first making the data "long" and then adding annotation columns.
Using just rows 1 and 26 you might get:
reshape(myData[c(1,26), ], idvar=c("Group","Subj"),
direction="long",
varying=2:9,
v.names=c("value") )
Group Subj time value
s.S1.1 s S1 1 4
w.S26.1 w S26 1 5
s.S1.2 s S1 2 5
w.S26.2 w S26 2 9
s.S1.3 s S1 3 6
w.S26.3 w S26 3 4
s.S1.4 s S1 4 10
w.S26.4 w S26 4 7
s.S1.5 s S1 5 3
w.S26.5 w S26 5 3
s.S1.6 s S1 6 6
w.S26.6 w S26 6 7
s.S1.7 s S1 7 6
w.S26.7 w S26 7 3
s.S1.8 s S1 8 9
w.S26.8 w S26 8 5
The 'time' variable is not really what you wanted; it just refers to the
sequence along the original wide column names.
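To illustrate what that 'time' index means, here is a minimal sketch on toy data. The measurement column names (Me.F.1 and so on) and the values are assumptions made up for the example, not the real 'myData'. It maps each 'time' value back to the wide column it came from:

```r
# Toy wide data; column names and values are hypothetical.
wide <- data.frame(
  Group  = c("s", "w"),
  Subj   = c("S1", "S26"),
  Me.F.1 = c(11, 12),
  Me.J.1 = c(21, 22),
  She.F.1 = c(31, 32),
  stringsAsFactors = FALSE
)

# Measurement columns are everything except the id columns.
meas <- setdiff(names(wide), c("Group", "Subj"))

long <- reshape(wide, idvar = c("Group", "Subj"),
                direction = "long",
                varying = meas,
                v.names = "value")

# 'time' is the position within 'meas', so index 'meas' by it
# to recover the original column name for every row.
long$source <- meas[long$time]
print(long)
```

Looking the names up this way does not depend on the wide columns being in any particular order.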
You can add the desired Ref, Var and Time columns with these
constructions:
str(times<-rep(c(1,2), length=nrow(myData)*8 ) )
num [1:408] 1 2 1 2 1 2 1 2 1 2 ...
str(times<-rep(c("F","J"), each=2, length=nrow(myData)*8 ) )
chr [1:408] "F" "F" "J" "J" "F" "F" "J" "J" "F" "F" ...
str(times<-rep(c("Me","She"), each=4, length=nrow(myData)*8 ) )
chr [1:408] "Me" "Me" "Me" "Me" "She" "She" "She" "She" ...
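As a quick check of those rep() patterns, this sketch builds the three label vectors for a single subject (8 measurements) so the combinations can be inspected. It assumes, as the constructions above do, that the rows for each subject arrive in the order Me.F.1, Me.F.2, Me.J.1, Me.J.2, She.F.1, and so on; if the wide columns are ordered differently the labels will not line up with the values.

```r
# Labels for one subject's 8 rows; 'length.out' is the full name of the
# argument abbreviated as 'length' in the calls above.
n <- 8
Time <- rep(c(1, 2), length.out = n)                   # fastest-varying
Var  <- rep(c("F", "J"), each = 2, length.out = n)     # changes every 2 rows
Ref  <- rep(c("Me", "She"), each = 4, length.out = n)  # changes every 4 rows

labels <- data.frame(Ref, Var, Time, stringsAsFactors = FALSE)
print(labels)
```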
It occurred to me that the ordering operation probably should have preceded the ancillary column creation, so this method is tested:
longData <- reshape(myData, idvar=c("Group","Subj"),
direction="long", #fixed the direction argument
varying=2:9,
v.names=c("value") )
longData <- longData[order(longData$Subj), ]
longData$Time <- rep(c(1,2), length=nrow(myData)*8 )
longData$Var <- rep(c("F","J"), each=2, length=nrow(myData)*8 )
longData$Ref <- rep(c("Me","She"), each=4, length=nrow(myData)*8 )
       Group Subj time value Time Var Ref
s.S1.1     s   S1    1     4    1   F  Me
s.S1.2     s   S1    2     5    2   F  Me
s.S1.3     s   S1    3     6    1   J  Me
s.S1.4     s   S1    4    10    2   J  Me
s.S1.5     s   S1    5     3    1   F She
s.S1.6     s   S1    6     6    2   F She
s.S1.7     s   S1    7     6    1   J She
s.S1.8     s   S1    8     9    2   J She
Looking at Jim Lemon's response, I think he just misinterpreted the structure of your data but gave you a perfectly usable response. You could have done much the same thing with a minor modification:
str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE)))
'data.frame': 408 obs. of 4 variables:
 $ Group : Factor w/ 2 levels "s","w": 1 1 1 1 1 1 1 1 1 1 ...
 $ Subj  : Factor w/ 51 levels "S1","S10","S11",..: 1 12 23 34 45 48 49 50 51 2 ...
 $ group1: Factor w/ 8 levels "Me.F.1","Me.F.2",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ value1: int 4 6 7 8 10 5 13 8 6 14 ...
Now you can just split apart the 'group1' column with sub() to make the three specified columns.
Lemon's method has the advantage that it properly carries along the column information, so the labels come from the column names rather than from row position.
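The sub() step could look something like the sketch below. It assumes every level of 'group1' has the form Ref.Var.Time, which is what the two levels visible in the str() output ("Me.F.1", "Me.F.2") suggest; the sample vector here is made up for illustration.

```r
# Hypothetical sample of 'group1' labels in the assumed "Ref.Var.Time" form.
group1 <- factor(c("Me.F.1", "Me.F.2", "Me.J.1", "She.J.2"))
lab <- as.character(group1)

# Three dot-separated fields; each capture group grabs one of them.
pattern <- "^([^.]+)\\.([^.]+)\\.([^.]+)$"

parts <- data.frame(
  Ref  = sub(pattern, "\\1", lab),
  Var  = sub(pattern, "\\2", lab),
  Time = as.integer(sub(pattern, "\\3", lab)),
  stringsAsFactors = FALSE
)
print(parts)
```

The resulting columns can then be attached to the stacked data frame with cbind() or by assigning them one at a time.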
-- David.
On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon <jim at bitwrit.com.au> wrote:
On 10/07/2011 07:28 AM, Gang Chen wrote:
I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result:
1) Subj: factor
2) Group: between-subjects factor (2 levels: s / w)
3) Reference: within-subject factor (2 levels: Me / She)
4) F: within-subject factor (2 levels: F1 / F2)
5) J: within-subject factor (2 levels: J1 / J2)
Hi Gang,
I don't know whether this is the format you want, but:
library(prettyR)
rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE))
Jim
David Winsemius, MD
West Hartford, CT