
obtaining first and last record for rows with same identifier

If your data.frame is ordered by patid, you can use the function rle 
in combination with cumsum to get the first and last row index of each 
block of rows sharing an identifier.  As a vector example:

 > a <- rep(c('a','b','c'),10)
 > a
  [1] "a" "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c" "a"
 [20] "b" "c" "a" "b" "c" "a" "b" "c" "a" "b" "c"
 > b <- a[order(a)]
 > b
  [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "b" "b"
 [20] "b" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c"
 > l <- rle(b)$lengths
 > cbind(l,cumsum(l),cumsum(l)-l+1)
       l
[1,] 10 10  1
[2,] 10 20 11
[3,] 10 30 21

# Use the line below to get, for each block of the data.frame, its 
# length, its start index, and its end index:
 > cbind(l,cumsum(l)-l+1,cumsum(l))
       l
[1,] 10  1 10
[2,] 10 11 20
[3,] 10 21 30
 >
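
To carry the vector example over to a data.frame, something like the following should work. It is only a sketch: the data are made up, and the column name visit is a hypothetical stand-in for whatever orders the records within a patid.

```r
# Made-up example data; 'visit' is a hypothetical within-patient ordering column
df <- data.frame(
  patid = c(3, 1, 1, 2, 3, 3, 2),
  visit = c(1, 1, 2, 1, 2, 3, 2)
)

# Sort so that all rows for one patid are contiguous (rle needs this)
df <- df[order(df$patid, df$visit), ]

l     <- rle(df$patid)$lengths  # number of rows in each patid block
last  <- cumsum(l)              # row index of the last record per patid
first <- last - l + 1           # row index of the first record per patid

first_rows <- df[first, ]       # first record for each patid
last_rows  <- df[last, ]        # last record for each patid
```

Note that first and last index into the sorted data.frame, so they are only meaningful after the order() step.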

Sean
On May 24, 2005, at 2:27 PM, sms13+ at pitt.edu wrote: