Hello,
I am very new to R and data analysis in general.
I am trying to generate values to append to my data frame using conditional statements.
I am playing with this simple example:

a <- c(1:4)
b <- c("meep", "foo", "meep", "foo")
d <- cbind(a, b)

Now what I want to do is, each time there is a "meep" in column 2 of d, print "oops", else print "yay".
So I wrote:

for(i in seq(along=d[,2])) {if (d[i]=="meep") { print("oops")}
else { print("yay")}
}

Result:
[1] "yay"
[1] "yay"
[1] "yay"
[1] "yay"

What am I doing wrong?

Furthermore, I would like to append the results to d:

d$c <- for(i in seq(along=d[,2])) {if (d[i]=="meep") { print("oops")}
else { print("yay")}
}

This doesn't really work, it just turns the whole thing into a list.

Although if:

c <- NA
d <- cbind(a, b, c)

and I coerce d into a data.frame, run:

d$c <- for(i in seq(along=d[,2])) {if (d[i]=="meep") { print("oops")}
else { print("yay")}
}

some glint of hope appears:

[1] "yay"
[1] "oops"

but then......

Error in if (d[i] == "meep") { : missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In if (d[i] == "meep") { :
  the condition has length > 1 and only the first element will be used
2: In if (d[i] == "meep") { :
  the condition has length > 1 and only the first element will be used
3: In if (d[i] == "meep") { :
  the condition has length > 1 and only the first element will be used

To complicate things a little bit more, in my real data there are 16 levels, so for each level I need to "print" a different value (that would be 16 nested ifs, and I am sure there must be a more sensible way to do this!)

Thanks in advance,

Laura
You are working with a matrix, so the "$" operator is not allowed (e.g., d$c). Also, in your test you have to test against the second column (e.g., d[i, 2]). Try this:

> a <- c(1:4)
> b <- c("meep", "foo", "meep", "foo")
> d <- cbind(a, b)
>
> for(i in seq(along=d[,2])) {if (d[i,2]=="meep") { print("oops")}
+ else { print("yay")}
+ }
[1] "oops"
[1] "yay"
[1] "oops"
[1] "yay"
>
> # put results back
> d <- cbind(d, c=ifelse(d[,2] == 'meep', 'oops', 'yay'))
> d
     a   b      c
[1,] "1" "meep" "oops"
[2,] "2" "foo"  "yay"
[3,] "3" "meep" "oops"
[4,] "4" "foo"  "yay"
>

On Sun, Apr 18, 2010 at 8:46 AM, Laura Ferrero-Miliani <laurafe@gmail.com> wrote:
> Hello,
> I am very new to R and data analysis in general.
> I am trying to generate values to append to my data frame using
> conditional statements.
> [...]

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
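A note on why the original loop printed "yay" four times: a single index on a matrix, as in d[i], counts through the elements column by column, so d[1] to d[4] are the entries of the first column ("1" to "4") and never reach the "meep"/"foo" column. The following is a minimal sketch of the difference, reusing a, b and d from the question; the names first_four, next_four and second_col are only illustrative.

```r
a <- 1:4
b <- c("meep", "foo", "meep", "foo")
d <- cbind(a, b)   # character matrix, 4 rows x 2 columns

# One index treats the matrix as a single vector in column-major order,
# so elements 1 to 4 are column "a" and elements 5 to 8 are column "b".
first_four <- d[1:4]
next_four <- d[5:8]

# Two indices select by row and column, which is what the test needs.
second_col <- d[, 2]

print(first_four)
print(next_four)
print(second_col)
```

On a data frame, by contrast, d[i] selects the whole i-th column, which would explain the "condition has length > 1" warnings in the later attempt and the failure once the loop reached the all-NA column.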
Hey Laura,

Just to add a cautionary note, in

> a <- c(1:4)
> b <- c("meep", "foo", "meep", "foo")
> d <- cbind(a, b)

d is a matrix, and a matrix can hold only one type of data. Since you have both integer (a) and character (b) data, everything has to be stored as character. From the help for cbind:

"The type of a matrix result determined from the highest type of any of the inputs in the hierarchy raw < logical < integer < real < complex < character < list"

This means that the first column of 1:4 is now treated as character. That may not be what you were intending. You can get around this by creating a data frame, which can store different types of data.

Best regards,

Josh

-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/
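The coercion described above can be checked directly. This is a small sketch, assuming the same a and b as in the question; the names m and df are only illustrative, and stringsAsFactors = FALSE is passed explicitly so that b stays character in older R versions as well.

```r
a <- 1:4
b <- c("meep", "foo", "meep", "foo")

# cbind() on plain vectors builds a matrix, which holds a single type,
# so the integers are converted to character strings.
m <- cbind(a, b)
print(typeof(m))     # type of the whole matrix
print(m[, "a"])      # the numbers are now strings

# A data frame keeps each column's own type.
df <- data.frame(a, b, stringsAsFactors = FALSE)
print(sapply(df, class))

# Arithmetic works on the data frame column; on the matrix column it would
# fail because the values are character.
print(df$a * 2)
```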
I would prefer version 1.
Version 2 creates a global variable r which you do not really need, since it contains the same values as the r column of d. In option 2, you should probably remove the variable r itself after it has been appended to d.

On 4/18/2010 5:23 PM, Laura Ferrero-Miliani wrote:
> Thanks for correcting my code Erich, do you think what I wrote is
> overcomplicating things? Considering what Juan wrote:
>
> # Option 1
> a <- 1:4
> b <- c("meep", "foo", "meep", "foo")
> d <- data.frame(a, b)
> d$r <- with(d, ifelse(b == 'meep', 'oops', 'yay'))
> d
>
> # Option 2
> a <- 1:4
> b <- c("meep", "foo", "meep", "foo")
> d <- cbind(a, b)
> r <- ifelse(d[,2] == 'meep', 'oops', 'yay')
> d <- cbind(d, r)
> d
>
> On Sun, Apr 18, 2010 at 5:13 PM, Erich Neuwirth
> <erich.neuwirth at univie.ac.at> wrote:
>> for(i in seq(along=d[,2])) {if (d[i,2]=="meep") { print("oops")}
>> else { print("yay")}
>> }
>>
>> is probably what you want.
>> But the way you are using cbind converts a into a vector of
>> character. It is not numeric any more.
>> Perhaps you want
>>
>> d<-cbind(as.data.frame(a),b)
>>
>> And then you could do
>>
>> for (el in d$b) print(ifelse(el=="meep","oops","yay"))
>>
>> On 4/18/2010 2:46 PM, Laura Ferrero-Miliani wrote:
>>> for(i in seq(along=d[,2])) {if (d[i]=="meep") { print("oops")}
>>> else { print("yay")}
>>> }

-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
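For the case with 16 levels mentioned at the start of the thread, one common alternative to nested ifs is a lookup table held in a named vector. The sketch below is only an illustration under assumptions: the vector name lookup, the extra level "bar" and the replacement values are made up, and only three levels are shown instead of sixteen.

```r
a <- 1:4
b <- c("meep", "foo", "meep", "foo")
d <- data.frame(a, b, stringsAsFactors = FALSE)

# Hypothetical lookup table: names are the levels found in b,
# values are what should be recorded for each level.
lookup <- c(meep = "oops", foo = "yay", bar = "hmm")

# Index the lookup by level name. as.character() guards against b being a
# factor, which would otherwise index by integer code; unname() drops the
# names carried over from the lookup vector.
d$r <- unname(lookup[as.character(d$b)])
print(d)
```

Adding a level then means adding one entry to lookup rather than another branch; a value of b that has no entry in lookup comes back as NA, which makes missing cases easy to spot.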