How to reshape wide format data.frame to long format?

6 messages · Fredrik Karlsson, Abhijit Dasgupta, David Winsemius +2 more

#
Dear list,

I need to convert this data.frame (column names and dimensions shown below)
 [1] "key"        "AMR.pa1.M"  "AMR.pa2.M"  "AMR.pa3.M"  "AMR.pa4.M"
 [6] "AMR.pa5.M"  "AMR.pa6.M"  "AMR.pa7.M"  "AMR.pa8.M"  "AMR.pa9.M"
[11] "AMR.pa10.M" "AMR.ta1.M"  "AMR.ta2.M"  "AMR.ta3.M"  "AMR.ta4.M"
[16] "AMR.ta5.M"  "AMR.ta6.M"  "AMR.ta7.M"  "AMR.ta8.M"  "AMR.ta9.M"
[21] "AMR.ta10.M" "AMR.ka1.M"  "AMR.ka2.M"  "AMR.ka3.M"  "AMR.ka4.M"
[26] "AMR.ka5.M"  "AMR.ka6.M"  "AMR.ka7.M"  "AMR.ka8.M"  "AMR.ka9.M"
[31] "AMR.ka10.M" "SMR.pa1.M"  "SMR.pa2.M"  "SMR.pa3.M"  "SMR.pa4.M"
[36] "SMR.pa5.M"  "SMR.pa6.M"  "SMR.pa7.M"  "SMR.pa8.M"  "SMR.pa9.M"
[41] "SMR.pa10.M" "SMR.ta1.M"  "SMR.ta2.M"  "SMR.ta3.M"  "SMR.ta4.M"
[46] "SMR.ta5.M"  "SMR.ta6.M"  "SMR.ta7.M"  "SMR.ta8.M"  "SMR.ta9.M"
[51] "SMR.ta10.M" "SMR.ka1.M"  "SMR.ka2.M"  "SMR.ka3.M"  "SMR.ka4.M"
[56] "SMR.ka5.M"  "SMR.ka6.M"  "SMR.ka7.M"  "SMR.ka8.M"  "SMR.ka9.M"
[61] "SMR.ka10.M"
[1] 42 61

into a 2520 x 3 data.frame (42 rows x 60 value columns), where the "key" variable is kept, the
values in columns 2-61 above are inserted into a "value" column and
the name of the column is inserted in a third column ("variable"
perhaps).

Like

key            variable     value
POSTOFF_1_1    AMR.pa1.M    5
POSTOFF_1_1    AMR.pa2.M    3
....

I think I should be able to do this using the "reshape" function, but
I cannot get it to work. I think I need some help to understand
this...


(If I could split the "variable" into three separate columns splitting
by ".", that would be even better.)

I appreciate all the help I could get.

/Fredrik
#
I would think that the following code should work:

library(reshape)  # melt() is provided by the reshape package (and by reshape2)
newcodesM = melt(codesM, id=1)

If other variables in the data.frame are factors, melt treats all of them as ID variables and tries to use all of them as "keys". Specifying the id variable you want to keep (I used id=1 since "key" is in the 1st column) will probably solve the issue.
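
Since the original question asks about the base reshape() function, here is a minimal sketch of the same wide-to-long conversion with it, for comparison. The toy codesM below is a hypothetical stand-in for the real 42 x 61 data (only three value columns, made-up numbers); the output column names "variable" and "value" follow the layout requested in the question.

```r
# Hypothetical toy data standing in for the real codesM
codesM <- data.frame(
  key = c("POSTOFF_1_1", "POSTOFF_1_2"),
  AMR.pa1.M = c(5, 4),
  AMR.pa2.M = c(3, 2),
  SMR.ka1.M = c(1, 0),
  stringsAsFactors = FALSE
)

# Every column except "key" holds values to be stacked
value_cols <- names(codesM)[-1]

long <- reshape(codesM,
                direction = "long",
                idvar     = "key",
                varying   = list(value_cols),  # one group of columns to stack
                v.names   = "value",           # name of the stacked value column
                timevar   = "variable",        # column that records the source name
                times     = value_cols)        # values written into "variable"

# reshape() builds row names from key and variable; drop them
rownames(long) <- NULL
```

The result has one row per key and original column, with columns key, variable and value.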

Abhijit
On Jan 20, 2011, at 10:51 AM, Fredrik Karlsson wrote:

            
#
As for your second question, you could certainly do

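# as.character() because melt returns "variable" as a factor; sapply(..., "[", i) picks the i-th piece for each row
newcodesM = transform(newcodesM,
    variable1 = sapply(strsplit(as.character(variable), '\\.'), "[", 1),
    variable2 = sapply(strsplit(as.character(variable), '\\.'), "[", 2),
    variable3 = sapply(strsplit(as.character(variable), '\\.'), "[", 3))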

though I'm sure there is a more efficient use of strsplit in this context. 
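
One way to avoid calling strsplit three times is to split once and bind the pieces into a matrix. This is a minimal base-R sketch, assuming every name has exactly three dot-separated parts (as in the column names listed in the question); the toy newcodesM and the new column names are hypothetical.

```r
# Hypothetical toy version of the melted data
newcodesM <- data.frame(
  key      = c("POSTOFF_1_1", "POSTOFF_1_1", "POSTOFF_1_2"),
  variable = factor(c("AMR.pa1.M", "SMR.ka10.M", "AMR.ta2.M")),
  value    = c(5, 3, 4)
)

# strsplit() does not accept factors, hence as.character();
# fixed = TRUE treats "." as a literal dot rather than a regex wildcard
pieces <- strsplit(as.character(newcodesM$variable), ".", fixed = TRUE)

# One row per input string, one column per dot-separated part
parts <- do.call(rbind, pieces)
colnames(parts) <- c("variable1", "variable2", "variable3")

newcodesM <- cbind(newcodesM, as.data.frame(parts, stringsAsFactors = FALSE))
```

If some names had fewer than three parts, rbind would recycle values, so it is worth checking the piece lengths first when the names are less regular.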

Abhijit
#
I don't "see" anything special about this. If there is an unusual  
aspect to it, then you should post a simpler full example.  What  
happens when you try:

library(reshape2)

longCodes <- melt(codesM)

(Then you can fiddle with the names() of the new dataframe.)
To split the "variable" column into its parts, use strsplit and "[".

If that's not clear, then post results of dput( codesM[1:5, 1:10] ) to  
provide a reproducible example.
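
A minimal sketch of what this suggestion might look like end to end, assuming the reshape2 package is installed. The toy codesM is a hypothetical stand-in for the real data, and the names given to the split columns (measure, item, suffix) are made up for illustration.

```r
library(reshape2)

# Hypothetical toy data standing in for the real codesM
codesM <- data.frame(
  key = c("POSTOFF_1_1", "POSTOFF_1_2"),
  AMR.pa1.M  = c(5, 4),
  SMR.ka10.M = c(3, 2),
  stringsAsFactors = FALSE
)

# Name the id column explicitly rather than letting melt() guess it
longCodes <- melt(codesM, id.vars = "key")

# melt() calls the new columns "variable" and "value"; rename here if desired
names(longCodes) <- c("key", "variable", "value")

# Split each name on the literal "." and pick the i-th piece with "["
pieces <- strsplit(as.character(longCodes$variable), ".", fixed = TRUE)
longCodes$measure <- sapply(pieces, "[", 1)
longCodes$item    <- sapply(pieces, "[", 2)
longCodes$suffix  <- sapply(pieces, "[", 3)
```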

  
    
#
Or colsplit, from reshape, that does this for you.
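
A sketch of the colsplit route, assuming the reshape2 version of the function, which takes the arguments string, pattern and names; the argument names in the older reshape package may differ. The input strings and output column names are hypothetical.

```r
library(reshape2)

# Hypothetical sample of the melted "variable" column
variable <- c("AMR.pa1.M", "SMR.ka10.M", "AMR.ta2.M")

# pattern is a regular expression, so the dot must be escaped
parts <- colsplit(variable, pattern = "\\.",
                  names = c("measure", "item", "suffix"))
```

The result is a data.frame with one column per piece, which can be attached to the melted data with cbind().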

Hadley