Manipulation of data.frame into an array

7 messages · Ioanna Ioannou, Bert Gunter, Gerrit Eichner +2 more

#
Hello everyone,


I want to transform a data.frame into an array (let's call it mydata), where mydata[[1]] is the first imputed dataset, and for each mydata[[d]] the first p columns are the covariates X and the last one is the outcome Y.


Let's assume a simple data.frame:


Imputed = data.frame( X1 = c(1,2,1,2,1,2,1,2, 1,2,1,2,1,2,1,2),
                      X2 = c(0,1,0,1,1,1,0,1, 0,1,0,1,1,1,0,1),
                      Y  = c(1,2,3,4,5,6,7,8, 1,2,3,4,5,6,7,8))

The first 8 rows were obtained from the first imputation and the latter 8 from the second.


Can you help me please?


Best,

ioanna
#
Hello,

I am not sure I understand the question, but see if the following is 
what you want.

split(Imputed, cumsum(c(0, diff(Imputed$Y) != 1)))
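[Editor's note] The one-liner above can be unpacked step by step. This is only a sketch on the example data from the original post; the intermediate names step, new_block and grp are introduced here for illustration. It relies on Y increasing by exactly 1 within each imputation, which holds for this toy example but is an assumption about real data.

```r
# Example data from the original post
Imputed <- data.frame(
  X1 = c(1,2,1,2,1,2,1,2, 1,2,1,2,1,2,1,2),
  X2 = c(0,1,0,1,1,1,0,1, 0,1,0,1,1,1,0,1),
  Y  = c(1,2,3,4,5,6,7,8, 1,2,3,4,5,6,7,8)
)

# Successive differences of Y; inside one imputation they are all 1
step <- diff(Imputed$Y)

# A difference other than 1 marks the first row of a new imputation.
# The leading 0 makes the vector as long as the data frame.
new_block <- c(0, step != 1)

# The running total turns the markers into a group id:
# 0 for rows 1-8, 1 for rows 9-16
grp <- cumsum(new_block)

# split() returns a list with one data frame per group id
mydata <- split(Imputed, grp)
names(mydata)
```

The list elements are named after the group ids ("0" and "1"), but mydata[[1]] and mydata[[2]] still select them by position.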


Hope this helps,

Rui Barradas
On 5/24/2018 3:46 PM, Ioanna Ioannou wrote:
#
This is one of those instances where a less superficial knowledge of R's
technical details comes in really handy.

What you need to do is convert the data frame to a single (numeric) vector
for, e.g. a matrix() call. This can be easily done by noting that a data
frame is also a list and using do.call():

## imp is the data frame:

do.call(c,imp)

 X11  X12  X13  X14  X15  X16  X17  X18  X19 X110 X111 X112 X113 X114
   1    2    1    2    1    2    1    2    1    2    1    2    1    2
X115 X116  X21  X22  X23  X24  X25  X26  X27  X28  X29 X210 X211 X212
   1    2    0    1    0    1    1    1    0    1    0    1    0    1
X213 X214 X215 X216   Y1   Y2   Y3   Y4   Y5   Y6   Y7   Y8   Y9  Y10
   1    1    0    1    1    2    3    4    5    6    7    8    1    2
 Y11  Y12  Y13  Y14  Y15  Y16
   3    4    5    6    7    8

So, e.g., for a 3-column matrix:
      [,1] [,2] [,3]
 [1,]    1    0    1
 [2,]    2    1    2
 [3,]    1    0    3
 [4,]    2    1    4
 [5,]    1    1    5
 [6,]    2    1    6
 [7,]    1    0    7
 [8,]    2    1    8
 [9,]    1    0    1
[10,]    2    1    2
[11,]    1    0    3
[12,]    2    1    4
[13,]    1    1    5
[14,]    2    1    6
[15,]    1    0    7
[16,]    2    1    8
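[Editor's note] The command that produced the matrix above did not survive in the archive. A call along the following lines would give the same result; this is a reconstruction under that assumption, not necessarily what was typed. The names v and m are illustrative.

```r
# imp is the example data frame from the original post
imp <- data.frame(
  X1 = c(1,2,1,2,1,2,1,2, 1,2,1,2,1,2,1,2),
  X2 = c(0,1,0,1,1,1,0,1, 0,1,0,1,1,1,0,1),
  Y  = c(1,2,3,4,5,6,7,8, 1,2,3,4,5,6,7,8)
)

# A data frame is a list of columns, so do.call(c, imp) is the same as
# c(X1 = imp$X1, X2 = imp$X2, Y = imp$Y): one named vector of length 48
v <- do.call(c, imp)

# matrix() fills column by column, so the three columns are X1, X2, Y
m <- matrix(v, ncol = 3)
dim(m)
```

Because the vector is laid out column after column, ncol = 3 recovers the original columns; the column names are lost along the way.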

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Thu, May 24, 2018 at 7:46 AM, Ioanna Ioannou <ii54250 at msn.com> wrote:
#
Why not use as.matrix(imp) in this case?
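[Editor's note] A quick check of this suggestion on the example data, as a sketch: for an all-numeric data frame, as.matrix() gives the same values as the matrix() approach and also keeps the column names. The names m1 and m2 are illustrative.

```r
# imp is the example data frame from the original post
imp <- data.frame(
  X1 = c(1,2,1,2,1,2,1,2, 1,2,1,2,1,2,1,2),
  X2 = c(0,1,0,1,1,1,0,1, 0,1,0,1,1,1,0,1),
  Y  = c(1,2,3,4,5,6,7,8, 1,2,3,4,5,6,7,8)
)

m1 <- matrix(do.call(c, imp), ncol = 3)  # values only, no column names
m2 <- as.matrix(imp)                     # same values, column names kept

colnames(m2)
all(m1 == m2)
```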

  Regards  --  Gerrit

On 24.05.2018 at 17:04, Bert Gunter wrote:
#
Hello everyone,


Thank you for this. Nonetheless, it is not exactly what I need.


I need mydata[[1]] to provide the values for all 3 variables (Y, X1 and X2) of the first imputation only. As it stands it returns the whole database.

Any ideas?


Best,

ioanna
#
It would help if you show exactly the structure of your desired result, using the simple example data you supplied (what, exactly, do you mean by "array"?)

If you want mydata[[1]] "to provide the values for all 3 variables (Y, X1 and X2) of the first imputation only" then this will do it:
  X1 X2 Y
1  1  0 1
2  2  1 2
3  1  0 3
4  2  1 4
5  1  1 5
6  2  1 6
7  1  0 7
8  2  1 8
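[Editor's note] The command behind this output was lost in the archive. One way to get such a list, sketched here under the assumption that the number of imputations is known and every imputation has the same number of rows, is to build an explicit imputation index and split on it. The names n_imp, n_obs and imp_id are hypothetical.

```r
# Example data from the original post
Imputed <- data.frame(
  X1 = c(1,2,1,2,1,2,1,2, 1,2,1,2,1,2,1,2),
  X2 = c(0,1,0,1,1,1,0,1, 0,1,0,1,1,1,0,1),
  Y  = c(1,2,3,4,5,6,7,8, 1,2,3,4,5,6,7,8)
)

n_imp <- 2                         # number of imputations (assumed known)
n_obs <- nrow(Imputed) / n_imp     # rows per imputation, 8 here

# 1,1,...,1,2,2,...,2: which imputation each row belongs to
imp_id <- rep(seq_len(n_imp), each = n_obs)

# A list with one data frame per imputation
mydata <- split(Imputed, imp_id)
mydata[[1]]

is.list(mydata)    # mydata is a list ...
is.array(mydata)   # ... not an array
```

Unlike grouping on jumps in Y, this does not depend on the values in the data, only on the rows being stacked imputation after imputation.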

But mydata is not an array, it's a list.

The "[[ ]]" syntax doesn't particularly make sense on an array object:
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12
[1] 1

But "[[ ]]" does make sense on a list object.
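[Editor's note] The commands for the array example above were also stripped from the archive. The printed output is consistent with the following sketch; the object names foo and lst are assumptions.

```r
# A 2 x 3 x 2 array filled with 1..12, matching the printed slices above
foo <- array(1:12, dim = c(2, 3, 2))

# On an array, [[1]] just returns the first element of the underlying vector
foo[[1]]

# Array-style indexing selects a whole slice instead
foo[, , 1]

# On a list, [[1]] returns the first component, whatever its type
lst <- list(a = 1:3, b = c("x", "y"))
lst[[1]]
```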
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 

On 5/24/18, 8:14 AM, "R-help on behalf of Ioanna Ioannou" <r-help-bounces at r-project.org on behalf of ii54250 at msn.com> wrote:

#
Hello,

I still don't understand; my code returns each imputation in a separate
data.frame.

mydata <- split(Imputed, cumsum(c(0, diff(Imputed$Y) != 1)))
mydata[[1]]
#  X1 X2 Y
#1  1  0 1
#2  2  1 2
#3  1  0 3
#4  2  1 4
#5  1  1 5
#6  2  1 6
#7  1  0 7
#8  2  1 8


And mydata[[2]] will be the other imputation.
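[Editor's note] If a true 3-dimensional array is wanted rather than a list, the list elements can be stacked along a third dimension. This is a sketch that assumes every imputation has the same number of rows and only numeric columns; the name arr is illustrative, and slices are then selected with arr[, , d] instead of mydata[[d]].

```r
# Example data from the original post
Imputed <- data.frame(
  X1 = c(1,2,1,2,1,2,1,2, 1,2,1,2,1,2,1,2),
  X2 = c(0,1,0,1,1,1,0,1, 0,1,0,1,1,1,0,1),
  Y  = c(1,2,3,4,5,6,7,8, 1,2,3,4,5,6,7,8)
)

mydata <- split(Imputed, cumsum(c(0, diff(Imputed$Y) != 1)))

# Convert each data frame to a matrix, concatenate the values, and
# reshape to rows x variables x imputations
arr <- array(
  unlist(lapply(mydata, as.matrix), use.names = FALSE),
  dim = c(nrow(mydata[[1]]), ncol(Imputed), length(mydata)),
  dimnames = list(NULL, names(Imputed), NULL)
)

dim(arr)
arr[, , 1]   # first imputation as an 8 x 3 matrix
```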

If this is not what you want, can you please post an example of the output
you expect for mydata[[1]], using the data you have posted?

Rui Barradas
On 5/24/2018 4:14 PM, Ioanna Ioannou wrote: