Generating a count variable
On Jun 1, 2009, at 1:14 PM, Joseph Magagnoli wrote:
Dear All,
I am practicing data manipulation and I would like to generate a count variable. My data looks like this:

Country MID
1       NA
1       0
1       0
1       1
1       0
2       0
2       1
2       0
2       0
2       0

I would like to generate a variable that counts the periods of zeros in the MID variable for each country. For example:

Country MID Count
1       NA         # ya' gotta put something there
1       0   1
1       0   2
1       1   0
1       0   1
2       0   1
2       1   0
2       0   1
2       0   2
2       0   3

I am used to doing my data manipulation in Stata, but I want to try to learn to do it in R.
The rle function is generally useful for such problems. Having created
a data.frame, dd, with those elements:
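> # Key each row by Country and MID; rle() then finds runs of identical keys
> rledd <- rle(paste(dd$Country, dd$MID, sep="."))
> # Number the rows within each run, then subtract MID so the single-row MID == 1 runs become 0
> as.vector(unlist(sapply(rledd$lengths, FUN=function(x) seq(1,x)))) - dd$MID
 [1] NA  1  2  0  1  1  0  1  2  3
> dd$Count <- as.vector(unlist(sapply(rledd$lengths, FUN=function(x) seq(1,x)))) - dd$MID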
> dd
   Country MID Count
1        1  NA    NA
2        1   0     1
3        1   0     2
4        1   1     0
5        1   0     1
6        2   0     1
7        2   1     0
8        2   0     1
9        2   0     2
10       2   0     3
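
For anyone who wants to reproduce this from scratch, here is a minimal sketch of the same rle idea. Two things in it are assumptions rather than part of the post above: the way dd is built (the post does not show that step) and the helper name count_zero_runs, which is made up for illustration. It uses sequence() to number the rows within each run, and ifelse() instead of subtracting MID. As far as I can tell the subtraction only yields 0 for MID == 1 when the 1s occur one row at a time, as they do in this data; the ifelse() version should also return 0 for every row in a longer run of 1s.

```r
# Rebuild the example data. The construction and column types are an
# assumption; the original post only says dd holds these values.
dd <- data.frame(Country = rep(1:2, each = 5),
                 MID     = c(NA, 0, 0, 1, 0, 0, 1, 0, 0, 0))

# Hypothetical helper (not from the post): count consecutive zeros in
# `mid`, restarting whenever `mid` or `country` changes.
count_zero_runs <- function(country, mid) {
  # A run ends whenever either the country or the MID value changes.
  runs <- rle(paste(country, mid, sep = "."))
  # sequence() concatenates 1:n for each run length n.
  pos <- sequence(runs$lengths)
  # Position within the run for zeros, 0 for non-zero MID, NA where MID is NA.
  ifelse(mid == 0, pos, 0L)
}

dd$Count2 <- count_zero_runs(dd$Country, dd$MID)
```

On the example data, Count2 comes out the same as the Count column shown above, including the NA in the first row.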
David Winsemius, MD
Heritage Laboratories
West Hartford, CT