Dear members, I want to create a sequence of numbers for the multiple records of individual animal in my dataset. The SAS code below will do the trick, but I want to learn to do it in R. Can anyone help ?

data ht&ssn;
  set ht&ssn;
  by anml_key;
  if first.anml_key then do;
    seq_ht_rslt=0;
  end;
  seq_ht_rslt+1;

Thanks in advance.
Stella
RE : Create sequence for dataset
2 messages · ssim@lic.co.nz, Peter Dalgaard
ssim at lic.co.nz writes:
Dear members, I want to create a sequence of numbers for the multiple records of individual animal in my dataset. The SAS code below will do the trick, but I want to learn to do it in R. Can anyone help ?

data ht&ssn;
  set ht&ssn;
  by anml_key;
  if first.anml_key then do;
    seq_ht_rslt=0;
  end;
  seq_ht_rslt+1;

Thanks in advance.
Whoa. Who just said that SAS data step code was clearer than R? Quite a bit of implicit knowledge in that one. Here's one way (someone please think up a better name for ave()...):
x <- numeric(nrow(airquality))
ave(x, airquality$Month, FUN=function(z)seq(along=z))

  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
 [19] 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3  4  5
 [37]  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 [55] 24 25 26 27 28 29 30  1  2  3  4  5  6  7  8  9 10 11
 [73] 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
 [91] 30 31  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
[109] 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3
[127]  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
[145] 22 23 24 25 26 27 28 29 30

or, same basic idea but a little less cryptic:
tb <- table(airquality$Month)
l <- lapply(tb, function(x)seq(length=x))
unsplit(l, airquality$Month)

 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3  4  5
(etc.)

or, brute force and ignorance:
x <- numeric(nrow(airquality))
for (i in unique(airquality$Month)) {
  ix <- airquality$Month == i
  x[ix] <- seq(along=x[ix])
}
x
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3  4  5
....

or, going to the opposite extreme (Gabor et al. are going to try and beat me on this...):
seq.factor <- function(f) ave(rep(1,length(f)),f,FUN=cumsum)
seq(as.factor(airquality$Month))

 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31  1  2  3  4  5
....
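Applied to the original question, the ave() approach might look roughly like the sketch below. The data frame ht and its values are made up for illustration; only the names anml_key and seq_ht_rslt come from the SAS code. seq_along() is the spelling of seq(along=) in more recent versions of R.

```r
# Hypothetical stand-in for the poster's dataset. Only the column names
# anml_key and seq_ht_rslt are taken from the SAS code; the rest is invented.
ht <- data.frame(
  anml_key = c(101, 101, 101, 205, 205, 317),
  ht_rslt  = c(1.2, 1.3, 1.5, 0.9, 1.0, 1.4)  # made-up measurements
)

# Number the records within each animal: 1, 2, 3, ... restarting at each key.
# The first argument only supplies a vector of the right length to be split.
ht$seq_ht_rslt <- ave(seq_along(ht$anml_key), ht$anml_key, FUN = seq_along)

print(ht)
```

Unlike the SAS BY statement, ave() does not need the data to be sorted by anml_key; records are numbered in row order within each key.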
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
(p.dalgaard at biostat.ku.dk)