
Combine two variables

4 messages · Simon Kiss, Rui Barradas, arun +1 more

#
Hi:
I have two variables in a data frame that are the results of a wording experiment in a survey. I'd like to create a third variable that combines the two variables.  Recode doesn't seem to work, because it just recodes the first variable into the third, then recodes the second variable into the third, overwriting the first recode. I can do this with a rather elaborate indexing process, subsetting the first column and then copying the data into the second etc. But I'm looking for a cleaner way to do this. The data frame looks like this.


df<-data.frame(var1=sample(c('a','b','c',NA),replace=TRUE, size=100), var2=sample(c('a','b','c',NA),replace=TRUE,size=100))

df<-subset(df, !is.na(var1) |!is.na(var2))

As you can see, if one variable has an NA, then the other variable has a valid value, so how do I just combine the two variables into one?
Thank you for your assistance.
Simon Kiss
#
Hello,

Inline.

On 11-09-2012 15:57, Simon Kiss wrote:
No, not necessarily. You are using sample() and there's no reason to 
believe the sampled values for var1 and var2 are going to be different. 
My first try gave me several rows with both columns NA. Then I used 
set.seed() to make the example reproducible.

set.seed(1)
df1 <- data.frame(var1=sample(c('a','b','c',NA), replace=TRUE, size=100),
     var2=sample(c('a','b','c',NA), replace=TRUE, size=100))
sum(is.na(df1$var1) & is.na(df1$var2))  # 8

So I suppose this is not the case with your real dataset.
Try the following.

df1$var3 <- df1$var1
df1$var3[is.na(df1$var1)] <- df1$var2[is.na(df1$var1)]
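
For illustration, here is the same fill-in idea on a small hand-made data frame, so the result can be checked by eye. The values are hypothetical, and stringsAsFactors = FALSE is an assumption made here to keep the columns as plain character vectors; the ifelse() line is just an equivalent way to write the same step, not part of the original suggestion.

```r
# Small hand-made data frame (hypothetical values) so the result is easy to check.
df1 <- data.frame(
  var1 = c("a", NA, "c", NA),
  var2 = c(NA, "b", "a", NA),
  stringsAsFactors = FALSE
)

# Start from var1, then fill its missing entries from var2.
df1$var3 <- df1$var1
missing1 <- is.na(df1$var1)
df1$var3[missing1] <- df1$var2[missing1]

# Equivalent one-liner with ifelse(): take var2 where var1 is NA, else var1.
var3_alt <- ifelse(is.na(df1$var1), df1$var2, df1$var1)

print(df1)
```

Note that var1 takes priority whenever both columns hold a value (row 3), and a row with both columns NA stays NA (row 4).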


Hope this helps,

Rui Barradas
#
Hi,
I am not sure what you mean by "combine".

Try this:
df1<-subset(df, !is.na(var1) &!is.na(var2))

df1$new<-paste0(df1$var1,df1$var2)
#  var1 var2 new
#1    b    a  ba
#2    c    b  cb
#3    b    b  bb
#5    a    a  aa
#6    b    b  bb
#7    a    b  ab
A.K.
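
A short sketch of why the rows are filtered with & before pasting: paste0() turns NA into the literal string "NA". The toy data below is hypothetical, and stringsAsFactors = FALSE is an assumption made here for clarity.

```r
# Hypothetical toy data with one NA in each column.
df <- data.frame(
  var1 = c("a", NA, "c"),
  var2 = c(NA, "b", "a"),
  stringsAsFactors = FALSE
)

# paste0() converts NA to the string "NA" before concatenating.
pasted <- paste0(df$var1, df$var2)
print(pasted)  # "aNA" "NAb" "ca"

# Keeping only complete rows first avoids the literal "NA" strings.
complete <- subset(df, !is.na(var1) & !is.na(var2))
complete$new <- paste0(complete$var1, complete$var2)
print(complete)
```

This keeps only rows where both variables are present, so it concatenates the two answers rather than picking the non-missing one.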



______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Hi

I am not sure I understand correctly. In the sample data frame you posted, the values in the two columns differ, so based on what you wrote I assume that

apply(df,1, paste, collapse="")

gives you a third variable combined from those two variables.

If you want to select the non-NA value from either variable, which one will you select when a row has no NA at all?
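
One way to see how often that question arises is to count the rows where both variables are present, and among those the rows where the two values disagree. This is only a sketch on hypothetical toy data; stringsAsFactors = FALSE is an assumption made here so that != compares plain strings.

```r
# Hypothetical toy data: row 3 has two valid, conflicting values.
df <- data.frame(
  var1 = c("a", NA, "c"),
  var2 = c(NA, "b", "a"),
  stringsAsFactors = FALSE
)

# Rows where neither variable is NA.
both_present <- !is.na(df$var1) & !is.na(df$var2)

# Of those, rows where the two values disagree.
conflict <- both_present & df$var1 != df$var2

sum(both_present)  # number of rows with no NA at all
df[conflict, ]     # the rows that need an explicit rule
```

If the count is zero in the real data, filling one column from the other is unambiguous; otherwise a rule for which column wins has to be chosen.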

Regards
Petr