mohinder_datta at yahoo.com
2009-Apr-09 13:48 UTC
[R] better way of recoding factors in data frame?
Hi all,

I apologize in advance for the length of this post, but I wanted to make sure I was clear.

I am trying to merge two data frames that share a number of rows (but some are unique to each data frame). Each row represents a subject in a study. The problem is that sex is coded differently in the two, including the way missing values are represented.

Here is an example of the merged data frame:

> myFrame2
   SubjCode SubjSex          Sex
1      sub1       M         <NA>
2      sub2       F         <NA>
3      sub3       M         Male
4      sub4       M         <NA>
5      sub5       F         <NA>
6      sub6       F       Female
7      sub7                 <NA>
8      sub8                 <NA>
9      sub9         Not Recorded
10    sub10         Not Recorded

I then apply the following:

> myFrame2$SubjSex <- factor(myFrame2$SubjSex, levels = c('M','F'))
> myFrame2$SubjSex <- factor(myFrame2$SubjSex, labels = c('Male','Female'))
> myFrame2 <- transform(myFrame2, newSex = ifelse(is.na(SubjSex), Sex, SubjSex))

...and get this:

> myFrame2
   SubjCode SubjSex          Sex newSex
1      sub1    Male         <NA>      1
2      sub2  Female         <NA>      2
3      sub3    Male         Male      1
4      sub4    Male         <NA>      1
5      sub5  Female         <NA>      2
6      sub6  Female       Female      2
7      sub7    <NA>         <NA>     NA
8      sub8    <NA>         <NA>     NA
9      sub9    <NA> Not Recorded      3
10    sub10    <NA> Not Recorded      3

I need that last column to have just 1 (Male), 2 (Female) or 0 (Missing), and the only way I've come up with seems very kludgy:

> myFrame2$newSex[is.na(myFrame2$newSex)] <- 0
> myFrame2$newSex <- ifelse(myFrame2$newSex == 3, 0, myFrame2$newSex)

That gives me the right values for "newSex", but I'd like to positively select for the values I want to keep, rather than negatively selecting the ones to change. I tried this:

> myFrame2$newSex <- ifelse(myFrame2$newSex == 1 || myFrame2$newSex == 2, myFrame2$newSex, 0)

But I just get 1 for every row in newSex. Does anyone know of a way to do this by positively selecting the values 1 and 2?

Thanks,
Mohinder
On Thu, 9 Apr 2009 mohinder_datta at yahoo.com wrote:

> Hi all,
>
> I apologize in advance for the length of this post, but I wanted to make sure
> I was clear.

Good strategy.

> I tried this:
>
>> myFrame2$newSex <- ifelse(myFrame2$newSex ==1 || myFrame2$newSex == 2, myFrame2$newSex, 0)

First, you need |, not ||, because || returns just a single value.

This still won't quite work, because the condition will be NA when newSex is NA (if you don't know a number then you don't know whether it is equal to 1 or 2).

So,

myFrame2$newSex <- ifelse(myFrame2$newSex %in% c(1,2), myFrame2$newSex, 0)

or, slightly shorter

myFrame2$newSex <- with(myFrame2, ifelse(newSex %in% c(1,2), newSex, 0))

    -thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
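To make the difference between the three conditions concrete, here is a minimal sketch in R. The standalone vector `newSex` is a hypothetical stand-in for the `myFrame2$newSex` column shown in the original post, and the names `keep_or`, `keep_in`, `with_or` and `recoded` are introduced only for illustration. It also suggests why the `||` attempt produced 1 in every row: a single TRUE condition makes `ifelse` return a single value (the first element of `newSex`, which is 1), and that value is then recycled down the column.

```r
# Hypothetical stand-in for the newSex column from the original post
newSex <- c(1, 2, 1, 1, 2, 2, NA, NA, 3, 3)

# Element-wise OR gives one result per row, but NA wherever newSex is NA
keep_or <- newSex == 1 | newSex == 2

# %in% never returns NA: a missing value is simply "not in the set"
keep_in <- newSex %in% c(1, 2)

# With the | condition the NAs survive the recode
with_or <- ifelse(keep_or, newSex, 0)

# With the %in% condition the NAs fall through to the 'no' branch
recoded <- ifelse(keep_in, newSex, 0)

print(with_or)  # 1 2 1 1 2 2 NA NA 0 0
print(recoded)  # 1 2 1 1 2 2 0 0 0 0
```

The `%in%` form is the "positive selection" asked for in the question: only values explicitly listed in `c(1, 2)` are kept, and everything else, including NA, becomes 0.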
Thank you for your help!