
Using unsplit - unsplit does not seem to reverse the effect of split

3 messages · Søren Højsgaard, Peter Dalgaard, Marc Schwartz (via MN)

#
In the OME data set in MASS, I would like to extract the first 5 observations per subject (= ID). So I do
 
library(MASS)
OMEsub <- split(OME, OME$ID)
OMEsub <- lapply(OMEsub,function(x)x[1:5,])
unsplit(OMEsub, OME$ID)

- which results in 
 
[[1]]
[1] 1 1 1 1 1
[[2]]
[1] 30 30 30 30 30
[[3]]
[1] low low low low low
Levels: N/A high low
[[4]]
[1] 35 35 40 40 45
[[5]]
[1] coherent   incoherent coherent   incoherent coherent  
Levels: coherent incoherent
[[6]]
[1] 1 4 0 1 2

............
 
[[1094]]
[1] 4 5 5 5 2
[[1095]]
[1] 100 100 100 100 100
[[1096]]
[1] 18 18 18 18 18
[[1097]]
[1] N/A N/A N/A N/A N/A
Levels: N/A high low
There were 50 or more warnings (use warnings() to see the first 50)

warnings()
Warning messages:
1: number of items to replace is not a multiple of replacement length
2: number of items to replace is not a multiple of replacement length
3: number of items to replace is not a multiple of replacement length
....
 
According to the documentation, unsplit is the reverse of split, but I must be missing a point somewhere... Can anyone help? Thanks in advance. Søren
#
Søren Højsgaard <Soren.Hojsgaard at agrsci.dk> writes:
It only works if the first argument is or could have resulted from a
split on the second argument. That is clearly not the case when you
are creating subvectors. 
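
As a small illustration of that condition (a sketch on made-up toy data, not on OME; the names x, f, parts and short are hypothetical), a list that really came from split(x, f) round-trips through unsplit, while a truncated one no longer matches the group sizes implied by f:

```r
# Toy values and a grouping factor, both invented for illustration.
x <- c(10, 20, 30, 40, 50, 60)
f <- factor(c("a", "b", "a", "b", "a", "b"))

# 'parts' is exactly what split(x, f) produces, so unsplit can undo it.
parts <- split(x, f)
roundtrip <- unsplit(parts, f)
identical(roundtrip, x)

# Keep only the first 2 values per group, similar to the question.
short <- lapply(parts, function(v) v[1:2])

# Each group now holds 2 values, but 'f' still describes 3 per group,
# so 'short' could not have resulted from a split on 'f'.
lengths(short)
table(f)
```

Passing short together with the original f to unsplit is the kind of mismatch that produces the "number of items to replace" warnings.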

I have on occasion wanted an unsplit that worked without the 2nd
argument as in

unsplit(l, rep(seq(along=l), sapply(l,length)) )

but if you think about it, it's not really doing anything that
do.call("c",l) or do.call("rbind",l) won't do.
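
A quick check of that equivalence for a list of vectors (a sketch; the list l below is made up for illustration):

```r
# A hypothetical list of numeric vectors of different lengths.
l <- list(c(1, 2), c(3, 4, 5), 6)

# Build a grouping vector from the list itself: 1 1 2 2 2 3.
g <- rep(seq(along = l), sapply(l, length))

via_unsplit <- unsplit(l, g)   # unsplit with the reconstructed grouping
via_c <- do.call("c", l)       # plain concatenation of the pieces

identical(via_unsplit, via_c)
```

Note that sapply(l, length) counts elements, which suits vectors; for a list of data frames it would count columns rather than rows.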
#
On Tue, 2005-09-27 at 19:12 +0200, Søren Højsgaard wrote:
If you read the documentation for split/unsplit, you will also note that
in the Details section it says:

 'unsplit' works only with lists of vectors

as opposed to lists of data frames, which is the result of your split()
operation.

Also note that in the Value section, it indicates:

'unsplit' returns a vector for which 'split(x, f)' equals 'value'

as opposed to unsplit returning a data frame.


Thus, use:

  OME1 <- do.call("rbind", OMEsub)

where OME1 will be the result of rbind()'ing the data frames in the
OMEsub list.

See ?do.call for more information.
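
For a self-contained picture of the whole split / subset / rbind pattern, here is a minimal sketch on a toy data frame (df, pieces, first2 and out are made-up names, and it keeps 2 rows per subject instead of 5). It uses head() rather than x[1:5,] as one possible choice, since head() simply returns fewer rows when a subject has fewer than requested, whereas indexing past the end of a data frame yields rows of NA:

```r
# Toy data frame: three subjects with 4, 2 and 3 observations.
df <- data.frame(ID = rep(c(1, 2, 3), times = c(4, 2, 3)),
                 y  = 1:9)

# One data frame per subject.
pieces <- split(df, df$ID)

# Keep at most the first 2 rows of each subject.
first2 <- lapply(pieces, function(d) head(d, 2))

# Stack the per-subject pieces back into a single data frame.
out <- do.call("rbind", first2)
out
```

The row names of the result are built from the list names and the original row names; rownames(out) <- NULL resets them if that is not wanted.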

HTH,

Marc Schwartz