Multiple lines for each record: how do I handle that

5 messages · Sarah Goslee, Jorge I Velez, Luca Meyer

#
If you provide a small reproducible example of your data format and
expected output, I'm sure someone here can offer a useful solution.

Without knowing what your data look like, not so easy.

Sarah
On Wed, Feb 22, 2012 at 2:22 PM, Luca Meyer <lucam1968 at gmail.com> wrote:
#
Sure, I am sorry I did not do that in the first place.

The datasets I have look like:

id <- c(1,1,2,2,2,3)
v1 <- c(NA,1,NA,1,NA,1)
v2 <- as.character(c("yes","","no","","","yes"))
v3 <- as.factor(c(NA,1,NA,NA,3,2))
d0 <- data.frame(id,v1,v2,v3)
d0

What I would need is to derive a dataset that looks like:

id <- c(1,2,3)
v1 <- c(1,1,1)
v2 <- as.character(c("yes","no","yes"))
v3 <- as.factor(c(1,3,2))
d1 <- data.frame(id,v1,v2,v3)
d1

The difficulty is that I need an automated procedure that detects each variable's type and aggregates it accordingly, because every dataset will differ from the previous one in the number of variables and records involved.
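
For illustration, here is a minimal sketch of one way such a procedure might look in base R. It assumes the rule is "for each id, keep the first value in each column that is neither NA nor an empty string", which is only inferred from the d0/d1 example above. The helper names is_valid and collapse_by are hypothetical, not from any package. Picking row indices rather than converting values means each column keeps its original type (numeric, character or factor) without per-type code.

```r
# Example data from above; stringsAsFactors = FALSE keeps v2 as character.
d0 <- data.frame(
  id = c(1, 1, 2, 2, 2, 3),
  v1 = c(NA, 1, NA, 1, NA, 1),
  v2 = c("yes", "", "no", "", "", "yes"),
  v3 = as.factor(c(NA, 1, NA, NA, 3, 2)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a value is neither NA nor an empty string.
# (Assumption: empty strings count as missing, as in the v2 column.)
is_valid <- function(x) !is.na(x) & as.character(x) != ""

# Hypothetical helper: collapse d to one row per distinct value of `key`,
# taking the first valid value of every other column within each group.
collapse_by <- function(d, key = "id") {
  ids <- unique(d[[key]])
  # Row numbers belonging to each id, in order of first appearance.
  rows <- split(seq_len(nrow(d)), factor(d[[key]], levels = ids))
  out <- data.frame(ids)
  names(out) <- key
  for (nm in setdiff(names(d), key)) {
    col <- d[[nm]]
    # For each group, the row number of its first valid value (NA if none).
    pick <- vapply(rows, function(r) {
      ok <- r[is_valid(col[r])]
      if (length(ok) > 0) ok[1] else NA_integer_
    }, integer(1))
    # Indexing the original column preserves its type; an NA index gives NA.
    out[[nm]] <- col[unname(pick)]
  }
  out
}

result <- collapse_by(d0)
result
```

On the d0 example this gives one row per id, with v1 = 1, 1, 1, v2 = "yes", "no", "yes" and v3 = 1, 3, 2, matching d1. If a group has several different valid values in the same column, only the first is kept, so a different rule would be needed for such data.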

Thank you,
Luca

On 22 Feb 2012, at 20:26, Sarah Goslee wrote: