Niklas Fischer
2012-Sep-15 20:36 UTC
[R] create new variable with ifelse? (reproducible example)
Dear R users,

I have a reproducible data set and am trying to create a new variable "clo" that is 1 if the know variable is equal to "very well" or "fairly well" and getalong is 4 or 5; otherwise it is 0.

rep_data<- read.table(header=TRUE, text="
          id1          id2        know getalong
 100000016_a1 100000016_a2   very well        4
 100000035_a1 100000035_a2 fairly well       NA
 100000036_a1 100000036_a2   very well        3
 100000039_a1 100000039_a2   very well        5
 100000067_a1 100000067_a2   very well        5
 100000076_a1 100000076_a2 fairly well        5
")

rep_data$clo<- ifelse((rep_data$know==c("fairly well","very well") &
rep_data$getalong==c(4,5)),1,0)

Something must be wrong, but I couldn't find out what.

rep_data

          id1          id2        know getalong clo
 100000016_a1 100000016_a2   very well        4   0
 100000035_a1 100000035_a2 fairly well       NA   0
 100000036_a1 100000036_a2   very well        3   0
 100000039_a1 100000039_a2   very well        5   0
 100000067_a1 100000067_a2   very well        5   0
 100000076_a1 100000076_a2 fairly well        5   0

Any help is appreciated.

Bests,
Niklas
Hi,

Try this:

str(rep_data)
#'data.frame':    6 obs. of  4 variables:
# $ id1     : Factor w/ 6 levels "100000016_a2",..: 1 2 3 4 5 6
# $ id2     : Factor w/ 2 levels "fairly","very": 2 1 2 2 2 1
# $ know    : Factor w/ 1 level "well": 1 1 1 1 1 1
# $ getalong: int  4 NA 3 5 5 5

rownames(rep_data)
#[1] "100000016_a1" "100000035_a1" "100000036_a1" "100000039_a1" "100000067_a1"
#[6] "100000076_a1"

rep_data$clo<-ifelse((rep_data$id2%in% c("very","fairly")) &(rep_data$getalong%in%c(4,5)),1,0)

rep_data
#                      id1    id2 know getalong clo
#100000016_a1 100000016_a2   very well        4   1
#100000035_a1 100000035_a2 fairly well       NA   0
#100000036_a1 100000036_a2   very well        3   0
#100000039_a1 100000039_a2   very well        5   1
#100000067_a1 100000067_a2   very well        5   1
#100000076_a1 100000076_a2 fairly well        5   1

A.K.
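The str() output in the reply above shows that read.table() split "very well" into two fields, so every column is shifted by one and id1 ends up as the row names. A minimal sketch of one way to avoid this, assuming the input text can be edited so that the two-word values are quoted (stringsAsFactors = FALSE is an added choice here, not something from the thread):

```r
# Same data as in the question, but with the two-word values quoted so that
# read.table() keeps each of them as a single field.
rep_data <- read.table(header = TRUE, stringsAsFactors = FALSE, text = '
id1 id2 know getalong
100000016_a1 100000016_a2 "very well" 4
100000035_a1 100000035_a2 "fairly well" NA
100000036_a1 100000036_a2 "very well" 3
100000039_a1 100000039_a2 "very well" 5
100000067_a1 100000067_a2 "very well" 5
100000076_a1 100000076_a2 "fairly well" 5
')

# Four columns now, with know holding the full two-word values.
str(rep_data)

# With the columns in the right place, the test can use know directly.
rep_data$clo <- ifelse(rep_data$know %in% c("very well", "fairly well") &
                         rep_data$getalong %in% c(4, 5), 1, 0)
rep_data
```

With the data read this way, the condition no longer has to rely on the shifted id2 column.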
Milan Bouchet-Valat
2012-Sep-15 21:10 UTC
[R] create new variable with ifelse? (reproducible example)
On Saturday, 15 September 2012 at 23:36 +0300, Niklas Fischer wrote:
> rep_data$clo<- ifelse((rep_data$know==c("fairly well","very well") &
> rep_data$getalong==c(4,5)),1,0)
>
> For sure, something must be wrong, I couldn't find it out.

Try:

rep_data$clo <- ifelse(rep_data$know %in% c("fairly well", "very well") & rep_data$getalong %in% c(4, 5), 1, 0)

(Not checked, because your data is not parsed correctly by read.table because of the spaces in the levels. Please use dput() instead.)

My two cents
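The dput() suggestion above can be illustrated with a short sketch. The data frame `small` and the round-trip check are made up for illustration; the point is only that dput() prints an R expression that rebuilds the object, so values containing spaces survive being pasted into an email:

```r
# A small hypothetical data frame with a two-word value and an NA.
small <- data.frame(
  know = c("very well", "fairly well"),
  getalong = c(4L, NA),
  stringsAsFactors = FALSE
)

# Prints an R expression (a structure(...) call) that recreates `small`.
dput(small)

# Round trip: capture that text and evaluate it, as a reader of the
# email would do by pasting it into their session.
txt <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```

Posting the dput() output instead of a printed table removes any dependence on how whitespace is interpreted when the data are read back in.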
(Ted Harding)
2012-Sep-15 22:02 UTC
[R] create new variable with ifelse? (reproducible example)
[See at end]

On 15-Sep-2012 20:36:49 Niklas Fischer wrote:
> Dear R users,
>
> I have a reproducible data and try to create new variable "clo" is 1 if
> know variable is equal to "very well" or "fairly well" and getalong is 4 or
> 5
> otherwise it is 0.
>
>[A]
> rep_data<- read.table(header=TRUE, text="
>           id1          id2        know getalong
>  100000016_a1 100000016_a2   very well        4
>  100000035_a1 100000035_a2 fairly well       NA
>  100000036_a1 100000036_a2   very well        3
>  100000039_a1 100000039_a2   very well        5
>  100000067_a1 100000067_a2   very well        5
>  100000076_a1 100000076_a2 fairly well        5
> ")
>
> rep_data$clo<- ifelse((rep_data$know==c("fairly well","very well") &
> rep_data$getalong==c(4,5)),1,0)
>
> For sure, something must be wrong, I couldn't find it out.
>
> rep_data
>
>           id1          id2        know getalong clo
>  100000016_a1 100000016_a2   very well        4   0
>  100000035_a1 100000035_a2 fairly well       NA   0
>  100000036_a1 100000036_a2   very well        3   0
>  100000039_a1 100000039_a2   very well        5   0
>  100000067_a1 100000067_a2   very well        5   0
>  100000076_a1 100000076_a2 fairly well        5   0
>
> Any help is appreciated..
> Bests,
> Niklas

There are several things wrong with the way you are trying to do it, and indeed it is a bit complicated!

First: if the above table (at >[A] above) is the format in which you input the data, then you should either comma-separate your data fields (and use sep="," in read.table(), or else just use read.csv()), or else enclose the two-word fields within "...", i.e.
EITHER:

[B]
id1, id2, know, getalong
100000016_a1, 100000016_a2, very well, 4
100000035_a1, 100000035_a2, fairly well, NA
100000036_a1, 100000036_a2, very well, 3
100000039_a1, 100000039_a2, very well, 5
100000067_a1, 100000067_a2, very well, 5
100000076_a1, 100000076_a2, fairly well, 5

OR:

[C]
id1 id2 know getalong
100000016_a1 100000016_a2 "very well" 4
100000035_a1 100000035_a2 "fairly well" NA
100000036_a1 100000036_a2 "very well" 3
100000039_a1 100000039_a2 "very well" 5
100000067_a1 100000067_a2 "very well" 5
100000076_a1 100000076_a2 "fairly well" 5

Otherwise, in your original format, read.table() will read in FIVE fields, since it will treat "very" and "well" as separate, and will treat "fairly" and "well" as separate. Furthermore, it will match the header "getalong" with the 5th field (4,NA,etc), the header "know" with the 4th field ("well","well",...,"well"), header "id2" with the 3rd field ("very","fairly","very",...,"fairly"), and header "id1" with the 2nd field ("100000016_a2"). And furthermore, the first field will become the row-names of the dataframe and will no longer be data!

Second: Use of "==" to compare $know with "very well" and "fairly well" will not work as you expect. In your comparison

rep_data$know==c("fairly well","very well")

you will get the result:

# [1] FALSE FALSE FALSE  TRUE FALSE FALSE

rather than your expected

# [1] TRUE TRUE TRUE TRUE TRUE TRUE
This is because "==" will compare each element of $know with ONE ELEMENT of c("fairly well","very well"), and will recycle these elements, so it will compare $know successively with

"fairly well","very well" "fairly well","very well" "fairly well","very well"

and since $know is

"very well","fairly well","very well","very well","very well","fairly well"

the only match is in the 4th instance, which is why you get

# [1] FALSE FALSE FALSE  TRUE FALSE FALSE

A better comparison is to use the "%in%" operator, as in:

rep_data$know %in% c("fairly well","very well")
# [1] TRUE TRUE TRUE TRUE TRUE TRUE

so you can in the end do:

rep_data$clo<- ifelse((rep_data$know %in% c("fairly well","very well")) & (rep_data$getalong %in% c(4,5)),1,0)

which results in:

rep_data
#            id1          id2        know getalong clo
# 1 100000016_a1 100000016_a2   very well        4   1
# 2 100000035_a1 100000035_a2 fairly well       NA   0
# 3 100000036_a1 100000036_a2   very well        3   0
# 4 100000039_a1 100000039_a2   very well        5   1
# 5 100000067_a1 100000067_a2   very well        5   1
# 6 100000076_a1 100000076_a2 fairly well        5   1

Finally, I suppose it is a happy coincidence that NA %in% c(4,5) yields FALSE rather than what R might have been written to yield, i.e. NA. Since NA is basically a synonym for "something that we do not know the value of", strictly speaking we do not know the value of NA %in% c(4,5).

It is possible that the "something that we do not know the value of" could be either 4 or 5, in which case NA %in% c(4,5) would be TRUE; but it is also possible that it could be neither 4 nor 5, in which case NA %in% c(4,5) would be FALSE. Since we do not know which of these possibilities is the case, we do not know whether it should be TRUE or FALSE, so one can argue that the result should itself be NA. But, as it happens,

3 %in% c(4,5)
# [1] FALSE
4 %in% c(4,5)
# [1] TRUE
5 %in% c(4,5)
# [1] TRUE
NA %in% c(4,5)
# [1] FALSE

so all is well!

Hoping this helps,
Ted.
-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 15-Sep-2012 Time: 23:02:14
This message was sent by XFMail
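The difference between "==" and %in% described in the message above can be reproduced without the data frame. The sketch below uses two plain vectors, `know` and `getalong`, holding the values from the question; the final step, which keeps NA in the result instead of 0, is an added variation and not something proposed in the thread:

```r
know <- c("very well", "fairly well", "very well",
          "very well", "very well", "fairly well")
getalong <- c(4, NA, 3, 5, 5, 5)

# "==" recycles the right-hand side and compares position by position.
eq_know <- know == c("fairly well", "very well")
eq_know       # FALSE FALSE FALSE TRUE FALSE FALSE

# With "==", a missing value also gives a missing comparison.
eq_get <- getalong == c(4, 5)
eq_get        # TRUE NA FALSE TRUE FALSE FALSE

# %in% asks, for each element on the left, whether it occurs anywhere
# on the right, and returns FALSE (never NA) for a missing value.
in_know <- know %in% c("fairly well", "very well")
in_get <- getalong %in% c(4, 5)
in_get        # TRUE FALSE FALSE TRUE TRUE TRUE

clo <- ifelse(in_know & in_get, 1, 0)
clo           # 1 0 0 1 1 1

# Variation (an assumption about what might be wanted): keep the
# uncertainty by returning NA wherever getalong is missing.
clo_na <- ifelse(is.na(getalong), NA, clo)
clo_na        # 1 NA 0 1 1 1
```

Whether a missing getalong should count as 0 or stay NA depends on the analysis; %in% silently chooses the former.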
Stephen Politzer-Ahles
2012-Sep-16 12:29 UTC
[R] create new variable with ifelse? (reproducible example)
Hi Niklas,

I like A.K.'s method. Here's another way to do what I think is the same thing you're asking for (this is how I did it before I knew ifelse() existed!)

rep_data$clo <- 0
rep_data[ rep_data$know %in% c("very well", "fairly well") & rep_data$getalong %in% c(4,5),]$clo <- 1

Best,
Steve
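The initialise-then-overwrite approach above can also be written by indexing the new column directly with the logical condition. This is a sketch on a cut-down copy of the data (only the two columns used in the test); the name `is_close` is made up for illustration:

```r
rep_data <- data.frame(
  know = c("very well", "fairly well", "very well",
           "very well", "very well", "fairly well"),
  getalong = c(4, NA, 3, 5, 5, 5),
  stringsAsFactors = FALSE
)

# Logical vector marking the rows that satisfy both conditions.
is_close <- rep_data$know %in% c("very well", "fairly well") &
  rep_data$getalong %in% c(4, 5)

# Start with 0 everywhere, then set the selected rows to 1.
rep_data$clo <- 0
rep_data$clo[is_close] <- 1
rep_data
```

Indexing the column (rep_data$clo[is_close]) rather than the row subset keeps the assignment to a single vector and gives the same result here.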
Rui Barradas
2012-Sep-16 13:11 UTC
[R] create new variable with ifelse? (reproducible example)
Hello,

Here's another one.

logic.result <- with(rep_data, know %in% c("very well", "fairly well") & getalong %in% c(4,5))
rep_data$clo <- 1*logic.result # coerce to numeric

Rui Barradas
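Multiplying by 1 is one of several ways to turn a logical vector into numbers. A small sketch comparing it with explicit coercion, using a stand-in `logic.result` vector with the values the condition produces on the data from the question:

```r
# Stand-in for the result of the logical condition in the message above.
logic.result <- c(TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)

by_multiplication <- 1 * logic.result      # arithmetic coerces TRUE/FALSE to 1/0
by_as_numeric <- as.numeric(logic.result)  # explicit coercion, type double
by_as_integer <- as.integer(logic.result)  # explicit coercion, type integer

typeof(by_multiplication)
typeof(by_as_integer)

# All three hold the same values; only the storage type differs.
all(by_multiplication == by_as_numeric)
all(by_multiplication == by_as_integer)
```

The explicit as.integer() or as.numeric() form states the intent more directly, while 1 * x is shorter; the resulting 0/1 values are the same.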