Hi R helpers,

If I have a dataset that looks like:

ID   record
1    20
.    30
.    25
2    26
.    15
3    21
4.....................

and I want it to become:

ID   record
1    20
1    30
1    25
2    26
2    15
3    21
4.....................

That is, I have to duplicate the IDs for subjects with multiple records. I am wondering whether this can be done in R, and I would be grateful if you could point me in the right direction.

Many thanks!
Zhixin
how to create duplicated ID in multi-records per subject dataset
4 messages · Zhixin Liu, Andrew, Henrique Dallazuanna +1 more
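If the records are in the file dupIDs.txt, then when you read them in,
the "." entries cause the ID column to be read as a factor (in R 4.0.0
and later this requires stringsAsFactors = TRUE). Coercing the factor to
numeric returns its integer level codes. Here the levels sort as ".",
"1", "2", "3", so subtracting 1 maps "." to 0 and each ID to itself, and
cummax() then carries the last ID forward. Note that this relies on the
IDs being the consecutive integers 1, 2, 3, ...
So, you could try the following:
# stringsAsFactors = TRUE is needed in R >= 4.0.0, where the default is FALSE
dupIDs <- read.table("dupIDs.txt", header = TRUE, stringsAsFactors = TRUE)
# level codes minus 1: "." -> 0, "1" -> 1, ...; cummax() fills in the zeros
dupIDs$ID2 <- cummax(as.numeric(dupIDs$ID) - 1)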
dupIDs
  ID record ID2
1  1     20   1
2  .     30   1
3  .     25   1
4  2     26   2
5  .     15   2
6  3     21   3

HTH,
Andrew.
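The factor-code trick above depends on the IDs being exactly 1, 2, 3, ... in order. As a minimal sketch of a variant that does not depend on the ID values, the "." entries can be read as NA and each row can take the ID from the most recent row that has one. The helper name fill_down and the second test vector are hypothetical, introduced only for illustration; the data frame is the example from the question.

```r
# Hypothetical helper: carry the last non-missing value forward.
fill_down <- function(x) {
  # Row index of the most recent non-missing value (0 if none seen yet)
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  # Leading missing values have nothing to copy from, so keep them NA
  idx[idx == 0L] <- NA
  x[idx]
}

# Example data from the question; "." is read as NA
dat <- read.table(text = "ID record
1 20
. 30
. 25
2 26
. 15
3 21", header = TRUE, na.strings = ".")

dat$ID2 <- fill_down(dat$ID)
dat
```

Because the fill works on row positions rather than on the ID values, it should also behave sensibly for IDs that are not consecutive, for example fill_down(c(101, NA, NA, 17, NA)).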
On Dec 15, 12:56 pm, "Zhixin Liu" <z... at efs.mq.edu.au> wrote:

Hi R helpers,

If I have a dataset that looks like:

ID   record
1    20
.    30
.    25
2    26
.    15
3    21
4.....................

and I want it to become:

ID   record
1    20
1    30
1    25
2    26
2    15
3    21
4.....................

That is, I have to duplicate the IDs for subjects with multiple records. I am wondering whether this can be done in R, and I would be grateful if you could point me in the right direction.

Many thanks!
Zhixin
______________________________________________
R-h... at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
You can also try ?rep and something like
dat <- read.table(textConnection("ID record
1 20
. 30
. 25
2 26
. 15
3 21
"),header=TRUE,na.strings=".")
ind <- !is.na(dat$ID)
id <- dat$ID[ind]
reps <- diff(c(seq_len(nrow(dat))[ind],nrow(dat)+1))
dat$new.id <- rep(id,reps)
dat
  ID record new.id
1  1     20      1
2 NA     30      1
3 NA     25      1
4  2     26      2
5 NA     15      2
6  3     21      3
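In the code above, the non-missing IDs sit in rows 1, 4 and 6, so diff(c(1, 4, 6, 7)) gives the run lengths 3, 2, 1 that are passed to rep(). As a sketch of an equivalent shortcut, the running count of non-missing IDs can be used directly as an index into the observed IDs. This assumes, as in the example data, that the first row always carries an ID.

```r
# Example data from the question; "." is read as NA
dat <- read.table(text = "ID record
1 20
. 30
. 25
2 26
. 15
3 21", header = TRUE, na.strings = ".")

ind <- !is.na(dat$ID)   # TRUE on the first row of each subject
# cumsum(ind) counts how many IDs have been seen so far (1, 1, 1, 2, 2, 3),
# which is exactly the position of each row's ID among the observed IDs
dat$new.id <- dat$ID[ind][cumsum(ind)]
dat
```

If the first row could be missing its ID, cumsum(ind) would start at 0 and the indexing would silently drop that row, so that case would need separate handling.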
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Zhixin Liu
Sent: Sunday, December 14, 2008 7:57 PM
To: r-help at r-project.org
Subject: [R] how to create duplicated ID in multi-records per subject dataset

Hi R helpers,

If I have a dataset that looks like:

ID   record
1    20
.    30
.    25
2    26
.    15
3    21
4.....................

and I want it to become:

ID   record
1    20
1    30
1    25
2    26
2    15
3    21
4.....................

That is, I have to duplicate the IDs for subjects with multiple records. I am wondering whether this can be done in R, and I would be grateful if you could point me in the right direction.

Many thanks!
Zhixin