Hi:
I have two variables in a data frame that are the results of a wording experiment in a survey. I'd like to create a third variable that combines the two variables. Recode doesn't seem to work, because it just recodes the first variable into the third, then recodes the second variable into the third, overwriting the first recode. I can do this with a rather elaborate indexing process, subsetting the first column and then copying the data into the second etc. But I'm looking for a cleaner way to do this. The data frame looks like this.
df<-data.frame(var1=sample(c('a','b','c',NA),replace=TRUE, size=100), var2=sample(c('a','b','c',NA),replace=TRUE,size=100))
df<-subset(df, !is.na(var1) |!is.na(var2))
As you can see, if one variable has an NA, then the other variable has a valid value, so how do I just combine the two variables into one?
Thank you for your assistance.
Simon Kiss
Combine two variables
4 messages · Simon Kiss, Rui Barradas, arun +1 more
Hello,
Comments inline.
On 11-09-2012 15:57, Simon Kiss wrote:
As you can see, if one variable has an NA, then the other variable has a valid value,
No, not necessarily. You are using sample() and there's no reason to
believe the sampled values for var1 and var2 are going to be different.
My first try gave me several rows with both columns NA. I then used
set.seed() to make the example reproducible.
set.seed(1)
df1 <- data.frame(var1=sample(c('a','b','c',NA), replace=TRUE, size=100),
var2=sample(c('a','b','c',NA), replace=TRUE, size=100))
sum(is.na(df1$var1) & is.na(df1$var2)) # 8
So I suppose this is not the case with your real dataset.
Try the following.
df1$var3 <- df1$var1
df1$var3[is.na(df1$var1)] <- df1$var2[is.na(df1$var1)]
Hope this helps,
Rui Barradas
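The two-step assignment above can also be written as a single ifelse() call. The following is only a minimal sketch on a small hand-made data frame; the name d and its values are made up for illustration, and stringsAsFactors = FALSE is set so that the columns are character vectors regardless of R version.

```r
# Hypothetical toy data: each row has exactly one non-missing answer.
d <- data.frame(var1 = c("a", NA, "c", NA),
                var2 = c(NA, "b", NA, "a"),
                stringsAsFactors = FALSE)

# Use var1 where it is present, otherwise fall back to var2.
d$var3 <- ifelse(is.na(d$var1), d$var2, d$var1)
d$var3
```

As with the indexing version, var1 takes precedence whenever both columns hold a value.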
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi,
I am not sure what you mean by "combine". Try this:
df1<-subset(df, !is.na(var1) & !is.na(var2))
df1$new<-paste0(df1$var1,df1$var2)
head(df1)
#  var1 var2 new
#1    b    a  ba
#2    c    b  cb
#3    b    b  bb
#5    a    a  aa
#6    b    b  bb
#7    a    b  ab
A.K.
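One thing to keep in mind with the paste0() approach: missing values are converted to the string "NA" rather than skipped, which is presumably why the subset above keeps only rows where both columns are present. A minimal sketch, using made-up vectors x and y:

```r
# Hypothetical vectors with one missing value each.
x <- c("a", NA)
y <- c(NA, "b")

# paste0() turns NA into the literal string "NA".
pasted <- paste0(x, y)
pasted
```

So pasting is only a clean way to combine the columns for rows where neither value is missing.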
Hi
I am not sure I understand correctly. In the sample data frame you posted, the values in the columns are different, so based on what you wrote I assume that
apply(df,1, paste, collapse="")
gives you a third variable combined from those two variables. If you want to select the non-NA value from either variable, which one will you select when a row has no NA?
Regards
Petr
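Before picking a rule for rows without any NA, it may help to check how many such rows exist and whether the two answers ever disagree. This is only a sketch on a hypothetical toy data frame d; the names both and conflict are made up for illustration.

```r
# Hypothetical toy data: rows 3 and 4 have answers in both columns.
d <- data.frame(var1 = c("a", NA, "c", "b"),
                var2 = c(NA, "b", "a", "b"),
                stringsAsFactors = FALSE)

# Rows where both variables are present, so a choice has to be made.
both <- !is.na(d$var1) & !is.na(d$var2)
sum(both)

# Of those, rows where the two answers actually differ.
conflict <- both & d$var1 != d$var2
d[conflict, ]
```

If conflict is all FALSE in the real data, the order in which the two columns are combined does not matter.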