Selecting First Incidence from Longitudinal Data
I think we need a task view on longitudinal data manipulation. There are so
many approaches to this - people need help navigating them.
I tend to stay away from the lapply-split methods as they don't look quite
as clean and may take longer to run. The aggregate function uses too much
data frame subscripting. The plyr package and the mApply function in the
Hmisc package provide some other nice solutions. Often I like to stick with
tapply using constructs like
with(mydata, tapply(1:nrow(mydata), subjectID, function(i) {
  ... operate on variables in mydata subscripted by [i] ...
}))
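
As a rough illustration of that construct applied to the question in this thread, here is a minimal sketch. The data frame `mydata`, its column name `subjectID`, and the helper names `first_row` and `first_incidence` are assumptions made for the example; the values are the ID and COMPL columns from the original post.

```r
# Example data from the thread (patient 4 never has a complication).
mydata <- data.frame(
  subjectID = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4),
  COMPL     = c(0, 0, 3, 0, 1, 2, 2, 0, 0, 0, 0, 2, 0, 0)
)

# For each subject, i holds that subject's row numbers in mydata.
first_row <- with(mydata, tapply(seq_len(nrow(mydata)), subjectID, function(i) {
  hit <- i[COMPL[i] != 0]                 # rows with a complication
  if (length(hit) > 0) hit[1] else i[1]   # first complication, else first row
}))

# Drop the array structure returned by tapply before subsetting.
first_incidence <- mydata[as.integer(first_row), ]
first_incidence
```

The function only ever sees row numbers, so any column of `mydata` can be subscripted by `[i]` inside it without building per-subject data frames.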
Frank
arun kirshna wrote
Hi,
I am not sure why you are getting different results. I couldn't reproduce your problem.

dat1 <- read.table(text="
ID COMPL SEX HEREDITY
1  0     1   2
1  0     1   2
1  3     1   2
2  0     0   1
2  1     0   1
2  2     0   1
2  2     0   1
3  0     0   1
3  0     0   1
3  0     0   1
3  0     0   1
3  2     0   1
4  0     1   2
4  0     1   2
",sep="",header=TRUE)

do.call(rbind,lapply(split(dat1,dat1$ID),function(x) if(any(x$COMPL!=0)) head(x[x$COMPL!=0,],1) else head(x,1)))
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1
#4  4     0   1        2

You could also try:
dat1[with(dat1,ave(COMPL,ID,FUN=function(x) if(any(x!=0)) cumsum(x>0) else seq_along(x)))==1,] #modification of David's code
#   ID COMPL SEX HEREDITY
#3   1     3   1        2
#5   2     1   0        1
#12  3     2   0        1
#13  4     0   1        2
A.K.
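
To see why the ave() variant picks one row per patient, it may help to look at the per-patient flag on its own. The sketch below pulls the anonymous function out under the hypothetical name `flag` and feeds it COMPL values for three of the example patients; it assumes, as in this data, that COMPL codes are never negative.

```r
# Same logic as the FUN passed to ave(), applied to one patient's COMPL values.
flag <- function(x) if (any(x != 0)) cumsum(x > 0) else seq_along(x)

flag(c(0, 0, 3))     # patient 1: 0 0 1, so the third visit is selected
flag(c(0, 1, 2, 2))  # patient 2: 0 1 2 3, so the second visit is selected
flag(c(0, 0))        # patient 4: 1 2, so the first visit is kept
```

In each case exactly one position equals 1, which is the row that the `== 1` comparison keeps.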
________________________________
From: Tasnuva Tabassum <t.tasnuva@>
To: arun <smartpink111@>
Sent: Sunday, February 24, 2013 12:08 AM
Subject: Re: [R] Selecting First Incidence from Longitudinal Data

Sorry, I tried this, but it gave me this answer:
#   ID COMPL SEX HEREDITY
#1   1     0   1        2
#4   2     0   0        1
#8   3     0   0        1
#13  4     0   1        2

On Sat, Feb 23, 2013 at 8:44 PM, arun <smartpink111@> wrote:
Hi,
Try this:
#dat1
do.call(rbind,lapply(split(dat1,dat1$ID),function(x) if(any(x$COMPL!=0))
head(x[x$COMPL!=0,],1) else head(x,1)))
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1
#4  4     0   1        2
A.K.
________________________________
From: Tasnuva Tabassum <t.tasnuva@>
To: Xiaogang Su <xiaogangsu@>
Cc: arun <smartpink111@>; R help <r-help@>; Rui Barradas <ruipbarradas@>
Sent: Saturday, February 23, 2013 11:23 PM
Subject: Re: [R] Selecting First Incidence from Longitudinal Data

Hi,
Thank you very much, but I forgot to mention that I also want to include the patients for whom no complication occurred. That is, for my data I want to include patient no. 4, for whom the COMPL value will be 0. In that case, what R function should I use?

On Sat, Feb 23, 2013 at 12:23 PM, Xiaogang Su <xiaogangsu@> wrote:
My bad. I didn't try it out with the real data. Here you go.
HTH, X

dat <- read.table(text="
ID COMPL SEX HEREDITY
1  0     1   2
1  0     1   2
1  3     1   2
2  0     0   1
2  1     0   1
2  2     0   1
2  2     0   1
3  0     0   1
3  0     0   1
3  0     0   1
3  0     0   1
3  2     0   1
4  0     1   2
4  0     1   2
", header = TRUE)
dat0 <- dat[dat$COMPL!=0, ]
dat0$sequence <- as.vector(unlist(lapply(aggregate(dat0$ID,
by=list(dat0$ID),FUN=length)$x, FUN=function(x){seq(1, x)})))
dat0 <- dat0[dat0$sequence==1, ]
dat0

On Sat, Feb 23, 2013 at 2:09 PM, arun <smartpink111@> wrote:
Hi,
Tried your approach:
dat1$sequence <- as.vector(unlist(lapply(aggregate(dat1$ID,
by=list(dat1$ID),FUN=length)$x, FUN=function(x){seq(1, x)})))
dat0 <- dat1[dat1$sequence==1 & dat1$COMPL!= 0, ] #your second solution
dat0
#[1] ID       COMPL    SEX      HEREDITY sequence
#<0 rows> (or 0-length row.names)
dat1[dat1$sequence==1,] #here the OP wanted first incidence where COMPL!=0
#   ID COMPL SEX HEREDITY sequence
#1   1     0   1        2        1
#4   2     0   0        1        1
#8   3     0   0        1        1
#13  4     0   1        2        1
A.K.

----- Original Message -----
From: Xiaogang Su <xiaogangsu@>
To: Rui Barradas <ruipbarradas@>
Cc: r-help@
Sent: Saturday, February 23, 2013 2:15 PM
Subject: Re: [R] Selecting First Incidence from Longitudinal Data
Try this:
dat$sequence <- as.vector(unlist(lapply(aggregate(dat$ID, by=list(dat$ID),
FUN=length)$x, FUN=function(x){seq(1, x)})))
dat0 <- dat[dat$sequence==1, ]
HTH, X
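
For readers who only need the within-patient row counter, a shorter route is sketched below using base R's ave(). This is an alternative, not what the poster used; the two-column `dat` is a cut-down copy of the example data assumed for illustration.

```r
# ID and COMPL columns from the example data in this thread.
dat <- data.frame(
  ID    = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4),
  COMPL = c(0, 0, 3, 0, 1, 2, 2, 0, 0, 0, 0, 2, 0, 0)
)

# Number the rows within each ID: 1, 2, 3, ... restarting for every patient.
dat$sequence <- ave(seq_along(dat$ID), dat$ID, FUN = seq_along)
dat$sequence
```

Because ave() writes each group's result back to that group's own positions, this version does not rely on the rows being sorted by ID, whereas concatenating per-group sequences does.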
On Sat, Feb 23, 2013 at 1:07 PM, Rui Barradas <ruipbarradas@> wrote:

Hello,
You can use ?aggregate and ?head to do what you want. Try the following.

dat <- read.table(text="
ID COMPL SEX HEREDITY
1  0     1   2
1  0     1   2
1  3     1   2
2  0     0   1
2  1     0   1
2  2     0   1
2  2     0   1
3  0     0   1
3  0     0   1
3  0     0   1
3  0     0   1
3  2     0   1
4  0     1   2
4  0     1   2
", header = TRUE)

aggregate(. ~ ID, data = subset(dat, COMPL != 0), head, 1)

Hope this helps,
Rui Barradas

On 23-02-2013 14:28, Tasnuva Tabassum wrote:
I have longitudinal competing-risk data of the form:

ID COMPL SEX HEREDITY
1  0     1   2
1  0     1   2
1  3     1   2
2  0     0   1
2  1     0   1
2  2     0   1
2  2     0   1
3  0     0   1
3  0     0   1
3  0     0   1
3  0     0   1
3  2     0   1
4  0     1   2
4  0     1   2

where COMPL is the health complication of diabetic patients, with value labels 0 = no complication, 1 = coronary heart disease, 2 = retinopathy, 3 = nephropathy. I want to select only the first complication that occurred for each patient. What R function can I use?
______________________________________________
R-help@ mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
==============================
Xiaogang Su, Ph.D.
Associate Professor & Statistician
School of Nursing, University of Alabama
Birmingham, AL 35294-1210
(205) 934-2355 [Office]
xgsu@
xiaogangsu@
https://sites.google.com/site/xgsu00/
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Selecting-First-Incidence-from-Longitudinal-Data-tp4659455p4659530.html
Sent from the R help mailing list archive at Nabble.com.