long format - find age when another variable is first 'high'

Dear R, 

I've got a data frame in long format (one row per exam) with children
examined multiple times at various ages.  I'm trying to find the first age
at which another variable (LDL-Cholesterol) is >= 130 mg/dL; for some
children, this may never happen.  I can do this with ddply (plyr) and
transformBy (doBy), but with 10,000 different children, these functions take
some time on my PCs - is there a faster way to do this in R?  My code on a
small dataset follows.

Thanks very much, David Freedman

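# Small example: 3 children, 6 exams
d <- data.frame(id = c(rep(1, 3), rep(2, 2), 3),
                age = c(5, 10, 15, 4, 7, 12),
                ldlc = c(132, 120, 125, 105, 142, 160))
d$high.ldlc <- ifelse(d$ldlc >= 130, 1, 0)  # 1 if LDL-C >= 130 mg/dL
d

# plyr: minimum age among the high exams, within each child
# (min() of an empty set gives Inf with a warning if a child is never high)
library(plyr)
d2 <- ddply(d, ~id, transform, plyr.minage = min(age[high.ldlc == 1]))

# doBy: the same calculation, added as a second column
library(doBy)
d2 <- transformBy(~id, data = d2, doby.minage = min(age[high.ldlc == 1]))
d2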
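
For comparison, here is a minimal base-R sketch of the same calculation that drops the low exams first and then takes one grouped minimum with tapply. The helper name first_high_age and its cutoff argument are made up for illustration (they are not part of plyr or doBy), and it is an assumption that this is faster on 10,000 children; it has not been timed here. Unlike the min() calls above, it gives NA rather than Inf for a child who never reaches the cutoff.

```r
# Hypothetical helper (not from plyr or doBy): age at the first exam with
# LDL-C at or above the cutoff, repeated on every row of that child.
first_high_age <- function(id, age, ldlc, cutoff = 130) {
  # Flag the high exams; a missing LDL-C value is treated as not high
  is.high <- !is.na(ldlc) & ldlc >= cutoff
  # Minimum age per child, computed only over the high exams
  min.age <- tapply(age[is.high], id[is.high], min)
  # Children with no high exam have no entry in min.age, so match()
  # returns NA for them and the lookup yields NA
  as.vector(min.age)[match(as.character(id), names(min.age))]
}

d <- data.frame(id = c(rep(1, 3), rep(2, 2), 3),
                age = c(5, 10, 15, 4, 7, 12),
                ldlc = c(132, 120, 125, 105, 142, 160))
d$base.minage <- first_high_age(d$id, d$age, d$ldlc)
d$base.minage
# [1]  5  5  5  7  7 12
```

The result has one value per row of d, so it can be assigned directly as a new column next to plyr.minage and doby.minage for checking.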