How to pick columns from a ragged array?

Hi Stuart,

This also should get you the IDs you wanted.
new1 <- id.d[duplicated(id.d[,2]) | duplicated(id.d[,2], fromLast=TRUE), ]
earliest <- tapply(DATE, ID, min)
rownames(earliest[earliest %in% new1])
#[1] "167"  "841"  "1019"
A.K.

----- Original Message -----
From: Stuart Leask <Stuart.Leask at nottingham.ac.uk>
To: Rui Barradas <ruipbarradas at sapo.pt>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Tuesday, October 23, 2012 7:37 AM
Subject: Re: [R] [r] How to pick colums from a ragged array?

Thanks Rui - your initial, very elegant suggestion, has spurred me on!

1. As you noticed, my example data had no examples of duplicate first dates (DOH!) 
I have corrected this and added a test case: an ID that has a duplicate which is not its earliest DATE, but which is the same DATE as an earliest/duplicate for another ID.

2. Your suggestion gave me all the duplicates:

how.many <- ave(id.d[,1], id.d[,1], id.d[,2], FUN = length)
nd.b <- id.d[how.many > 1, ]

3. I can then simply make a table of earliest DATEs by ID, and then see which DATEs in this table are shared:

earliest <- tapply(DATE, ID, min)
rownames(earliest[earliest %in% nd.b])

This seems to work, and it does seem to exclude IDs which have a duplicate date that is the same as the minimum date for another ID.
I'm trying to work out why!
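
One possible explanation, offered as a sketch rather than a verdict on either approach: earliest %in% nd.b only asks whether an ID's earliest DATE occurs anywhere in nd.b, not whether it is duplicated within that same ID. The test case passes because the earliest DATE of the ID with the later duplicate is not a duplicate for anyone. The reverse situation could slip through: an ID whose earliest DATE occurs once, but matches a later duplicate of another ID. The toy data below is hypothetical (not taken from the real dataset), and the names ID2, DATE2, by.value, min.date, n.at.min and per.id are made up for illustration.

```r
# Hypothetical toy data: ID 1 repeats DATE 20, which is not its earliest DATE;
# ID 2 has 20 as its earliest DATE, but only once.
ID2   <- c(1, 1, 1, 2, 2)
DATE2 <- c(10, 20, 20, 20, 30)
id.d2 <- cbind(ID2, DATE2)

# Approach from the message: find duplicated ID/DATE pairs, then match by value.
# names() is used here instead of rownames() so that a length-one result
# still returns its label.
how.many2 <- ave(id.d2[, 1], id.d2[, 1], id.d2[, 2], FUN = length)
nd.b2     <- id.d2[how.many2 > 1, ]
earliest2 <- tapply(DATE2, ID2, min)
by.value  <- names(earliest2[earliest2 %in% nd.b2])  # picks up ID 2

# Per-ID check: count the rows of each ID that fall on that ID's earliest DATE.
min.date <- ave(DATE2, ID2, FUN = min)
n.at.min <- tapply(DATE2 == min.date, ID2, sum)
per.id   <- names(n.at.min)[n.at.min > 1]            # no ID qualifies
```

On the example data further down, both versions should return the same three IDs, because no ID there has an unduplicated earliest DATE that coincides with another ID's duplicate.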

Many, many thanks for the gift of that function. I will compare the two approaches (and assume that mine is flawed!).

Stuart

************************************************

ID <- c(58,58,58,58,167,167,323,323,323,323,323,323,323
,547,794,814,814,814,814,814,814,841,841,841,841,841
,841,841,841,841,910,910,910,910,910,910,999,1019,1019
,1019)

DATE <- 
c(20060821,20061207,20080102,20090904,20040205,20040205,20051111
,20060111,20071119,20080107,20080407,20080521,20080711,20041005
,20070905,20020814,20021125,20040429,20040429,20071205,20080227
,20050421,20050421,20060428,20060602,20060816,20061025,20061129
,20070112,20070514, 19870409,19870508,19870508, 20091120,20091210
,20091224,20050503,19870508,19870508,19880330)

id.d <- cbind (ID,DATE )

how.many <- ave(id.d[,1], id.d[,1], id.d[,2], FUN = length)
nd.b<- id.d[how.many > 1, ]

earliest <- tapply(DATE, ID, min)  # table of earliest DATEs
rownames(earliest[earliest %in% nd.b])  # IDs of duplicates at the earliest date for that individual. I think...

******************************************************************

-----Original Message-----
From: Rui Barradas [mailto:ruipbarradas at sapo.pt] 
Sent: 23 October 2012 12:21
To: Stuart Leask
Cc: r-help at r-project.org
Subject: Re: [R] [r] How to pick colums from a ragged array?

Hello,

Thinking again, if you just want the first/last in each ID that repeats the DATE, the following function does the job. Since there were no such cases in your data example, I've added 3 rows to the dataset.

ID <- c(58,58,58,58,167,167,323,323,323,323,323,323,323
,547,794,814,814,814,814,814,814,841,841,841,841,841
,841,841,841,841,910,910,910,910,910,910,910,910,999,1019,1019
,1019,1019)

DATE <- c(20060821,20061207,20080102,20090904,20040205,20040323,20051111
,20060111,20071119,20080107,20080407,20080521,20080711,20041005
,20070905,20020814,20021125,20040429,20040429,20071205,20080227
,20050421,20060130,20060428,20060602,20060816,20061025,20061129
,20070112,20070514,20091105,20091105,20091117,20091119,20091120,20091210
,20091224,20091224,20050503,19870508,19880223,19880330,19880330)

id.d <- cbind(ID, DATE)

getRepeat <- function(x, first = TRUE){
    fun <- if(first) head else tail
    sp <- split(data.frame(x), x[,1])
    first.date <- tapply(x[,2], x[,1], FUN = fun, 1)
    lst <- lapply(seq_along(sp), function(j) sp[[j]][,2] == first.date[j])
    n <- unlist(lapply(lst, sum))
    sp1 <- sp[n > 1]
    i1 <- lst[n > 1]
    lapply(seq_along(sp1), function(j) sp1[[j]][i1[[j]], ])
}

getRepeat(id.d)  # defaults to first = TRUE
getRepeat(id.d, first = FALSE)  # to get the last ones

Hope this helps,

Rui Barradas

On 23-10-2012 10:59, Rui Barradas wrote:
