Hi:

I have two variables in a data frame that are the results of a wording experiment in a survey. I'd like to create a third variable that combines the two variables. Recode doesn't seem to work, because it just recodes the first variable into the third, then recodes the second variable into the third, overwriting the first recode. I can do this with a rather elaborate indexing process, subsetting the first column and then copying the data into the second, etc. But I'm looking for a cleaner way to do this. The data frame looks like this.

df <- data.frame(var1=sample(c('a','b','c',NA), replace=TRUE, size=100),
                 var2=sample(c('a','b','c',NA), replace=TRUE, size=100))
df <- subset(df, !is.na(var1) | !is.na(var2))

As you can see, if one variable has an NA, then the other variable has a valid value, so how do I just combine the two variables into one?

Thank you for your assistance.
Simon Kiss
Hello,

Inline.

On 11-09-2012 15:57, Simon Kiss wrote:
> df<-data.frame(var1=sample(c('a','b','c',NA),replace=TRUE, size=100), var2=sample(c('a','b','c',NA),replace=TRUE,size=100))
>
> df<-subset(df, !is.na(var1) |!is.na(var2))
>
> As you can see, if one variable has an NA, then the other variable has a valid value,

No, not necessarily. You are using sample() and there's no reason to believe the sampled values for var1 and var2 are going to be different. My first try gave me several rows with both columns NA. Then I used set.seed() and it became reproducible.

set.seed(1)
df1 <- data.frame(var1=sample(c('a','b','c',NA), replace=TRUE, size=100),
                  var2=sample(c('a','b','c',NA), replace=TRUE, size=100))
sum(is.na(df1$var1) & is.na(df1$var2))  # 8

So I suppose this is not the case with your real dataset. Try the following.

df1$var3 <- df1$var1
df1$var3[is.na(df1$var1)] <- df1$var2[is.na(df1$var1)]

Hope this helps,

Rui Barradas

> so how do I just combine the two variables into one?
> Thank you for your assistance.
> Simon Kiss
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
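The fill-in step suggested above can also be written as a single vectorised expression. The following is only a minimal sketch on made-up toy data (the data frame `d` and its values are hypothetical, not the survey data), and it assumes the columns are character vectors rather than factors, since ifelse() would return integer codes for factors.

```r
# Toy data (hypothetical values, not the poster's survey data).
# stringsAsFactors = FALSE keeps the columns as character vectors.
d <- data.frame(var1 = c("a", NA, "c", NA),
                var2 = c(NA, "b", NA, NA),
                stringsAsFactors = FALSE)

# Take var1 where it is present, otherwise fall back to var2.
d$var3 <- ifelse(is.na(d$var1), d$var2, d$var1)

# Rows where both inputs are missing stay NA in var3.
print(d)
```

If the real columns are factors, converting them with as.character() first, or using the indexed assignment shown in the message above, avoids the integer-code problem.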
Hi,

I am not sure what you mean by "combine". Try this:

df1 <- subset(df, !is.na(var1) & !is.na(var2))
df1$new <- paste0(df1$var1, df1$var2)
head(df1)
#   var1 var2 new
# 1    b    a  ba
# 2    c    b  cb
# 3    b    b  bb
# 5    a    a  aa
# 6    b    b  bb
# 7    a    b  ab

A.K.

----- Original Message -----
From: Simon Kiss <sjkiss at gmail.com>
To: r-help at r-project.org
Sent: Tuesday, September 11, 2012 10:57 AM
Subject: [R] Combine two variables
Hi

I am not sure I understand correctly. In the sample data frame you posted, the values in the columns are different, so based on what you wrote I assume that

apply(df, 1, paste, collapse="")

gives you a third variable combined from those 2 variables.

If you want to select the non-NA value from either variable, which one will you select when there is no NA in a row?

Regards
Petr

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Simon Kiss
> Sent: Tuesday, September 11, 2012 4:57 PM
> To: r-help at r-project.org
> Subject: [R] Combine two variables
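The question raised above, namely what to do with rows where neither variable is NA, can be checked before choosing a combining rule. This is a rough sketch on hypothetical toy data (the names `d`, `pasted`, `both`, and `conflict` are illustrative only). It also shows that paste() turns a missing value into the literal string "NA", which is usually not what is wanted here.

```r
# Toy data (hypothetical values): row 3 has two identical answers,
# row 4 has two different answers.
d <- data.frame(var1 = c("a", NA, "c", "b"),
                var2 = c(NA, "b", "c", "a"),
                stringsAsFactors = FALSE)

# Row-wise paste: a missing value becomes the string "NA" in the result.
pasted <- apply(d, 1, paste, collapse = "")

# Rows where both answers are present.
both <- !is.na(d$var1) & !is.na(d$var2)

# Rows where both answers are present and disagree; these need an
# explicit rule before the two columns can be merged into one.
conflict <- both & d$var1 != d$var2

print(pasted)
print(sum(both))
print(sum(conflict))
```

If sum(conflict) is zero on the real data, taking the non-missing value from either column is unambiguous; otherwise a rule for the conflicting rows has to be decided first.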