
struggling with "split" function

4 messages · Dimitri Liakhovitski, Dimitris Rizopoulos

#
I am very sorry for such a simple question, but I am struggling with "split".
I have the following data frame:
x<-data.frame(A=c(NA,NA,NA,NA,"split",NA,NA,NA,NA,"split",NA,NA,NA,NA,"split",NA,NA,NA,NA),
B=c("Name1","text1","text2","text3",NA,"Name2","text1","text2","text3",NA,"Name3","text1","text2","text3",NA,"Name4","text1","text2","text3"),
C=c(NA,1,NA,3,NA,NA,4,5,6,NA,NA,7,8,9,NA,NA,3,3,3),D=c(NA,1,1,2,NA,NA,5,6,NA,NA,NA,9,8,7,NA,NA,2,2,2),
E=c(NA,3,2,1,NA,NA,6,5,4,NA,NA,7,7,8,NA,NA,1,NA,1))
print(x)

All I want to do is to split x, i.e., to create a list of data frames
that are currently separated by the word "split" in column A. In this
example, it would be 4 data frames, the first of them being:
A  B     C  D  E
NA Name1 NA NA NA
NA text1 1  1  3
NA text2 NA 1  2
NA text3 3  2  1

etc.

I tried:
split(x, x$A)
split(x,x$A == 'split')
split(x,!is.na(x$A))

But nothing produces what I need.
Thanks a lot for any hint!
#
one way is the following:

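# label each run of consecutive NA / non-NA values in A
ind <- rle(is.na(x$A))
ind <- rep(seq_along(ind$lengths), ind$lengths)
# keep only the NA rows, drop column A, and split by run label
na.ind <- is.na(x$A)
split(x[na.ind, -1], ind[na.ind])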
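
A variant of the same idea, offered only as a sketch: instead of labelling runs with rle(), count the separator rows seen so far with cumsum(). The small data frame and the names is_sep, grp and pieces are made up for illustration; they are not from the original post. With this version the list elements are numbered 1, 2, 3, ..., whereas the rle() version above labels them 1, 3, 5, 7 on the example data.

```r
# Small stand-in for the data frame in the question (hypothetical values).
x <- data.frame(
  A = c(NA, NA, "split", NA, NA, NA, "split", NA),
  B = c("Name1", "text1", NA, "Name2", "text1", "text2", NA, "Name3"),
  C = c(NA, 1, NA, NA, 4, 5, NA, NA),
  stringsAsFactors = FALSE
)

is_sep <- !is.na(x$A)         # TRUE on the separator rows
grp    <- cumsum(is_sep) + 1  # block number: 1 before the first separator, 2 after it, ...

# Drop the separator rows and column A, then split by block number.
pieces <- split(x[!is_sep, -1], grp[!is_sep])
```

Here pieces is a list of data frames, one per block, and names(pieces) gives the block numbers as character strings.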


I hope it helps.

Best,
Dimitris
#
Thanks a lot, Dimitris.
It totally works on my example data frame.
I know, it's probably hard to address, but when I try to apply it to
the real huge data frame I have, after the last line I get:
Error in `[.default`(x$A, na.ind, -1) :  incorrect number of dimensions.
I know it's impossible to answer this question without seeing the
data, but still: what do you think might be wrong?

Do you think it could be because my first column contains something
other than "split"? No, I've just run table() on A and it gives:
split <NA>
204 6356

I also checked the first dimension of x and length(na.ind): they
are the same length, 6560.

No idea where the error might lie...


Thanks a lot!
Dimitri

On Sun, Sep 6, 2009 at 5:43 AM, Dimitris
Rizopoulos<d.rizopoulos at erasmusmc.nl> wrote:

  
    
#
Found a mistake - it was mine!
Thanks a lot for your help!
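
The thread does not say what the mistake was. One possible reading, and it is only a guess, comes from the call shown in the error message, `[.default`(x$A, na.ind, -1): there the two subscripts are applied to the column x$A rather than to the data frame x. The sketch below reproduces that kind of error on made-up data; the names v, df, keep, msg and res are hypothetical.

```r
# Stand-in for x$A: a factor, which is what data.frame() produced
# for character columns by default before R 4.0.
v  <- factor(c(NA, "split", NA))
df <- data.frame(A = v, B = c("Name1", NA, "text1"), C = c(1, NA, 2))
keep <- is.na(v)

# Two subscripts on a vector fail, because a vector has no dimensions.
msg <- tryCatch(v[keep, -1], error = function(e) conditionMessage(e))
print(msg)

# The same two subscripts on the data frame select rows and drop column 1.
res <- df[keep, -1]
print(res)
```

If this is what happened, checking is.data.frame() on the object being subscripted would have shown the problem.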
On Sun, Sep 6, 2009 at 8:43 AM, Dimitri Liakhovitski<ld7631 at gmail.com> wrote: