long format - find age when another variable is first 'high'

On May 25, 2009, at 7:45 AM, David Freedman wrote:

            
The first thing that I would do is to get rid of records that are not  
relevant to your question:

 > d
  id age ldlc high.ldlc
1  1   5  132         1
2  1  10  120         0
3  1  15  125         0
4  2   4  105         0
5  2   7  142         1
6  3  12  160         1


# Get records with high ldl
d.new <- subset(d, ldlc >= 130)


 > d.new
  id age ldlc high.ldlc
1  1   5  132         1
5  2   7  142         1
6  3  12  160         1


That reduces the size of the dataset, perhaps substantially. It also
drops any subject who is not relevant to the question (e.g., one who
never has LDL >= 130).

Then get the minimum age for each of the remaining subjects:

 > aggregate(d.new$age, list(id = d.new$id), min)
  id  x
1  1  5
2  2  7
3  3 12
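
For anyone who wants to run the two steps above end to end, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the printout, and the names first.high and first.high.vec are only illustrative. The tapply() call at the end is an alternative I am adding for comparison, not part of the approach described above.

```r
# Rebuild the example data from the printout above
d <- data.frame(id   = c(1, 1, 1, 2, 2, 3),
                age  = c(5, 10, 15, 4, 7, 12),
                ldlc = c(132, 120, 125, 105, 142, 160))
d$high.ldlc <- as.integer(d$ldlc >= 130)

# Step 1: keep only the records with high LDL
d.new <- subset(d, ldlc >= 130)

# Step 2: youngest age per subject among the remaining records
first.high <- aggregate(d.new$age, list(id = d.new$id), min)

# Alternative (illustrative): the same minimum ages as a vector
# named by id, using tapply()
first.high.vec <- tapply(d.new$age, d.new$id, min)
```

aggregate() returns a data frame with one row per id, which is convenient if the result has to be merged back into the original data. tapply() returns a vector named by id, which is easier to index directly.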


Try that to see what sort of time reduction you observe.
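
One way to measure that is system.time(). The sketch below runs on simulated data, since I do not have the real dataset: the number of subjects, the visit ages and the LDL distribution are all made up for illustration, and the object names (big, big.high, first.high.big) are hypothetical.

```r
# Simulated long-format data. Sizes and distributions are assumptions,
# not taken from the real dataset.
set.seed(1)
n.id <- 2000
visit.ages <- c(5, 8, 11, 14, 17)
big <- data.frame(id   = rep(seq_len(n.id), each = length(visit.ages)),
                  age  = rep(visit.ages, times = n.id),
                  ldlc = round(rnorm(n.id * length(visit.ages),
                                     mean = 110, sd = 20)))

# Time the filter-then-aggregate approach
timing <- system.time({
  big.high <- subset(big, ldlc >= 130)
  first.high.big <- aggregate(big.high$age,
                              list(id = big.high$id), min)
})
print(timing)
```

Wrapping the existing code in system.time() in the same way gives a figure to compare against on the same machine.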

HTH,

Marc Schwartz