how to create duplicated ID in multi-records per subject dataset

4 messages · Zhixin Liu, Andrew, Henrique Dallazuanna +1 more

#
Hi R helpers,

I have a dataset that looks like this:
ID   record 
1        20
.         30
.         25
2         26
.         15
3         21
4.....................

And I want it to become:
ID   record 
1        20
1        30
1        25
2         26
2        15
3         21
4.....................

That is, I need to repeat the ID on every row for subjects with multiple records. Is it possible to do this in R? I would be grateful if you could point me in the right direction.

Many thanks!

Zhixin
#
If the records are in the file dupIDs.txt, then when you read them in,
the IDs become a factor (on R 4.0.0 and later, only if you pass
stringsAsFactors = TRUE).  Coercing a factor to numeric returns its
integer level codes, one unique number per factor level.

So, you could try the following:

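# stringsAsFactors = TRUE is needed on R >= 4.0.0, where it is no longer the default
dupIDs <- read.table("dupIDs.txt", header = TRUE, stringsAsFactors = TRUE)
dupIDs$ID2 <- cummax(as.numeric(dupIDs$ID) - 1)
dupIDs
  ID record ID2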
1  1     20   1
2  .     30   1
3  .     25   1
4  2     26   2
5  .     15   2
6  3     21   3
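
For anyone wondering why the subtraction and cummax work, here is a minimal sketch on an inline copy of the example data (the inline text stands in for dupIDs.txt, which is an assumption). In the usual locales the "." level sorts before the digits, so it gets code 1 and the real IDs get codes 2, 3, 4. Subtracting 1 turns "." into 0, and cummax then carries the last real ID forward.

```r
# Inline stand-in for dupIDs.txt (assumed to hold the same six rows)
txt <- "ID record
1 20
. 30
. 25
2 26
. 15
3 21"
dupIDs <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

levels(dupIDs$ID)                    # "." is expected to sort first, then "1" "2" "3"
codes <- as.numeric(dupIDs$ID) - 1   # level codes shifted so that "." becomes 0
dupIDs$ID2 <- cummax(codes)          # running maximum fills the 0s with the last ID
dupIDs
```

Note that this relies on the IDs being 1, 2, 3, ... in increasing order. The level codes follow the sort order of the ID strings, so with IDs such as 7, 12, 40 the codes would no longer match the IDs.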

HTH,

Andrew.
On Dec 15, 12:56 pm, "Zhixin Liu" <z... at efs.mq.edu.au> wrote:
#
You can also look at ?rep and try something like:

dat <- read.table(textConnection("ID   record 
1        20
.         30
.         25
2         26
.         15
3         21
"),header=TRUE,na.strings=".")

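ind <- !is.na(dat$ID)                                    # rows that carry an ID
id <- dat$ID[ind]                                        # the IDs, in order of appearance
reps <- diff(c(seq_len(nrow(dat))[ind], nrow(dat) + 1))  # number of rows for each ID
dat$new.id <- rep(id, reps)                              # repeat each ID over its rows
dat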
  ID record new.id
1  1     20      1
2 NA     30      1
3 NA     25      1
4  2     26      2
5 NA     15      2
6  3     21      3
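
A more compact variation on the same idea, sketched here as an alternative rather than anything posted in the thread: count the non-missing IDs with cumsum to number the subjects, then use that counter to index the non-missing IDs. It assumes the first row has an ID, and the column name new.id simply mirrors the one used above.

```r
dat <- read.table(text = "ID record
1 20
. 30
. 25
2 26
. 15
3 21", header = TRUE, na.strings = ".")

filled <- !is.na(dat$ID)            # TRUE on the first row of each subject
grp <- cumsum(filled)               # running subject counter, one value per row
dat$new.id <- dat$ID[filled][grp]   # look up the ID that belongs to each counter value
dat
```

Because the IDs are looked up rather than derived from factor codes, this should also work when the IDs are not consecutive or not sorted.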