Message-ID: <971536df0905250714t702f27b0yac9c3337a971df95@mail.gmail.com>
Date: 2009-05-25T14:14:28Z
From: Gabor Grothendieck
Subject: long format - find age when another variable is first 'high'
In-Reply-To: <23706393.post@talk.nabble.com>
Depending on what you want, you could try the following (I haven't
checked the speed). Here the ldlc in the first row has been changed so
that id=1 has no value > 130, just to illustrate that case as well:
> d <- data.frame(id = c(rep(1, 3), rep(2, 2), 3), age=c(5, 10, 15, 4, 7, 12),
+ ldlc=c(122, 120, 125, 105, 142, 160))
> library(sqldf)
> sqldf("select * from d left join (select id, min(age) min_age from d where ldlc > 130 group by id) using(id)")
  id age ldlc min_age
1  1   5  122    <NA>
2  1  10  120    <NA>
3  1  15  125    <NA>
4  2   4  105     7.0
5  2   7  142     7.0
6  3  12  160    12.0
> # or this (which just gives the data frame of id and min_age):
> sqldf("select id, min_age from d left join (select id, min(age) min_age from d where ldlc > 130 group by id) using(id) group by id")
  id min_age
1  1    <NA>
2  2     7.0
3  3    12.0
> # or this (which is similar but omits the NAs)
> sqldf("select id, min(age) from d where ldlc > 130 group by id")
  id min(age)
1  2        7
2  3       12
See sqldf home page at:
http://sqldf.googlecode.com
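
For comparison, here is a minimal base R sketch of the same left join, assuming the same small data frame `d` as above. The names `high`, `min_age` and `res` are only illustrative, and the speed on 10,000 children has not been checked. It uses the same `> 130` cutoff as the queries above; switch to `>=` if a value of exactly 130 should count as high, as in the original question.

```r
# Same example data as in the sqldf transcript above
d <- data.frame(id = c(rep(1, 3), rep(2, 2), 3),
                age = c(5, 10, 15, 4, 7, 12),
                ldlc = c(122, 120, 125, 105, 142, 160))

# Keep only the rows above the cutoff, then take the minimum age per id
high <- d[d$ldlc > 130, ]
min_age <- aggregate(age ~ id, data = high, FUN = min)
names(min_age)[2] <- "min_age"

# Left join back onto d: ids with no qualifying row get NA for min_age
res <- merge(d, min_age, by = "id", all.x = TRUE)
res
```

Here `min_age` on its own corresponds to the last query (ids with no high value are omitted), and `res` corresponds to the first one.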
On Mon, May 25, 2009 at 8:45 AM, David Freedman <3.14david at gmail.com> wrote:
>
> Dear R,
>
> I've got a data frame with children examined multiple times and at various
> ages. I'm trying to find the first age at which another variable
> (LDL-Cholesterol) is >= 130 mg/dL; for some children, this may never happen.
> I can do this with transformBy and ddply, but with 10,000 different
> children, these functions take some time on my PCs - is there a faster way
> to do this in R? My code on a small dataset follows.
>
> Thanks very much, David Freedman
>
> d<-data.frame(id=c(rep(1,3),rep(2,2),3),age=c(5,10,15,4,7,12),ldlc=c(132,120,125,105,142,160))
> d$high.ldlc<-ifelse(d$ldlc>=130,1,0)
> d
> library(plyr)
> d2<-ddply(d,~id,transform,plyr.minage=min(age[high.ldlc==1]));
> library(doBy)
> d2<-transformBy(~id,da=d2,doby.minage=min(age[high.ldlc==1]));
> d2
> --
> View this message in context: http://www.nabble.com/long-format---find-age-when-another-variable-is-first-%27high%27-tp23706393p23706393.html
> Sent from the R help mailing list archive at Nabble.com.