
trouble converting an array to a dataframe

4 messages · Hadley Wickham, Christopher W. Ryan

#
I start with a dataframe called xrays. It contains scores on films from
each of two radiologists. It is in "long" format. I used the reshape
package to melt a data frame and then cast it into "wide" format, one
line for each patient (identified by redlognumb) with scores from both
radiologists for a given patient on the same line.

I named the result of the casting xrays.data. It is an array. I'd like
it to be a three-variable dataframe: one column of scores for each of
the two radiologists, plus one column for redlognumb (because I will
then need to merge it with another dataframe that has a column named
redlognumb). As you can see below, the data.frame() function turns
xrays.data into a two-variable dataframe. How can I get three columns
(or variables) into my final dataframe?

Thanks.
  redlognumb radiologis barrtotal
1          3          2        13
2          4          2        16
3          5          2        10
4          6          2        11
5          9          2        NA
6         10          2        NA
id=c("redlognumb","radiologis"))))

  redlognumb radiologis  variable value
1          3          2 barrtotal    13
2          4          2 barrtotal    16
3          5          2 barrtotal    10
4          6          2 barrtotal    11
7          1          1 barrtotal    11
8          2          1 barrtotal     2
radiologis
redlognumb  1  2
        1  11 NA
        2   2 NA
        3  12 13
        4  16 16
        5  12 10
       . .  . cut off for brevity . . .
int [1:42, 1:2, 1] 11 2 12 16 12 13 18 8 19 14 ...
 - attr(*, "dimnames")=List of 3
  ..$ redlognumb: Named chr [1:42] "1" "2" "3" "4" ...
  .. ..- attr(*, "names")= chr [1:42] "1" "2" "3" "5" ...
  ..$ radiologis: Named chr [1:2] "1" "2"
  .. ..- attr(*, "names")= chr [1:2] "1" "80"
  ..$ variable  : Named chr "barrtotal"
  .. ..- attr(*, "names")= chr "1"
   X1.barrtotal X2.barrtotal
1            11           NA
2             2           NA
3            12           13
4            16           16
5            12           10
6            13           11
7            18           NA
8             8           NA
. . . . cut off for brevity . . .
#
On Tue, Jan 20, 2009 at 11:10 PM, Christopher W. Ryan
<cryan at binghamton.edu> wrote:
It sounds to me like this final data frame would just be equivalent to
your initial unmolten data.  What's the difference?

Hadley

PS. The melt function takes an na.rm argument that will remove any
missing values.
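
As a minimal sketch of the na.rm suggestion, assuming the reshape package is installed: the rows below are taken from the output printed in the original post, and the column name radiologis is spelled as it appears there.

```r
library(reshape)

# A few rows from the output shown in the original post;
# patients 9 and 10 have a missing score from radiologist 2.
xrays <- data.frame(
  redlognumb = c(3, 4, 9, 10),
  radiologis = c(2, 2, 2, 2),
  barrtotal  = c(13, 16, NA, NA)
)

# na.rm = TRUE drops the molten rows whose value is missing.
m <- melt(xrays, id = c("redlognumb", "radiologis"), na.rm = TRUE)
print(m)
```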

  
    
#
I probably did not explain my data clearly. I am starting with a
dataframe with three columns:

redlognumb     radiologist    barrtotal

where the entries in the variable radiologist are either 1 or 2,
indicating which radiologist generated that barrtotal. All subjects had
their X-ray read independently by both radiologists. So there are two
rows for each subject.

I want to convert it to this structure:

redlognumb    radiologist.1.barrtotal    radiologist.2.barrtotal

in which there is only one row for each subject.
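
For illustration only, the same long-to-wide step can be sketched with base R's reshape() function, which is not what this thread uses. The values are the first few shown in the original post. Base R names the new columns barrtotal.1 and barrtotal.2 rather than the names written above.

```r
# Long format: one row per patient per radiologist.
# In the posted output, patients 1 and 2 have no row for radiologist 2.
xrays <- data.frame(
  redlognumb  = c(1, 2, 3, 4, 5, 3, 4, 5),
  radiologist = c(1, 1, 1, 1, 1, 2, 2, 2),
  barrtotal   = c(11, 2, 12, 16, 12, 13, 16, 10)
)

# Wide format: one row per patient, one barrtotal column per radiologist.
# Patient/radiologist combinations with no row are filled with NA.
wide <- reshape(xrays,
                idvar     = "redlognumb",
                timevar   = "radiologist",
                direction = "wide")
print(wide)
```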

At any rate, in the meantime, I think I figured out that I was "melting"
improperly, and I think I've got it now.  Thanks.

--Chris
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
40 Arch Street, Johnson City, NY  13790
cryanatbinghamtondotedu
PGP public keys available at http://home.stny.rr.com/ryancw/

"If you want to build a ship, don't drum up the men to gather wood,
divide the work and give orders. Instead, teach them to yearn for the
vast and endless sea."  [Antoine de St. Exupery]
#
On Thu, Jan 22, 2009 at 9:09 AM, Christopher W. Ryan
<cryan at binghamton.edu> wrote:
You should just be able to cast like:

cast(m, redlognumb ~ radiologist + variable)

If you haven't already, you might want to look at the introduction
available at http://had.co.nz/reshape
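
A minimal sketch of that cast, followed by the merge mentioned in the original post. It assumes the reshape package is installed. The xrays values are the first few from the original post; the second data frame, other, and its site column are made up purely to illustrate the merge.

```r
library(reshape)

# Long format: one row per patient per radiologist.
xrays <- data.frame(
  redlognumb  = c(1, 2, 3, 4, 5, 3, 4, 5),
  radiologist = c(1, 1, 1, 1, 1, 2, 2, 2),
  barrtotal   = c(11, 2, 12, 16, 12, 13, 16, 10)
)

# Melt with both identifiers as id variables, then cast to one row per patient.
m    <- melt(xrays, id = c("redlognumb", "radiologist"))
wide <- cast(m, redlognumb ~ radiologist + variable)

# Hypothetical second data frame sharing the redlognumb key (invented values).
other <- data.frame(redlognumb = 1:5,
                    site       = c("A", "A", "B", "B", "A"))

merged <- merge(as.data.frame(wide), other, by = "redlognumb")
print(merged)
```

Because redlognumb stays an ordinary column in the cast result, it can be used directly as the merge key.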

Regards,

Hadley