Sam Steingold
2006-May-11 16:09 UTC
[R] cannot turn some columns in a data frame into factors
Hi,

I have a data frame df and a list of names of columns that I want to
turn into factors:

df.names <- attr(df,"names")
sapply(factors, function (name) {
  pos <- match(name,df.names)
  if (is.na(pos)) stop(paste(name,": no such column\n"))
  df[[pos]] <- factor(df[[pos]])
  cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n")
})
cat("factors:",sapply(df,is.factor),"\n")

the output is:

Month ( 1 ): TRUE
factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

i.e., there is a column named "Month" (the 1st column), and it is indeed
turned into a factor inside sapply(), but after that it is numerical
again!

what am I doing wrong?

-- 
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://honestreporting.com http://truepeace.org http://openvotingconsortium.org
http://thereligionofpeace.com http://memri.org http://palestinefacts.org
UNIX, car: hard to learn/easy to use; Windows, bike: hard to learn/hard to use.
jim holtman
2006-May-11 16:27 UTC
[R] cannot turn some columns in a data frame into factors
try '<<-' as the assignment to make it global.

df[[pos]] <<- factor(df[[pos]])

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?
Sam Steingold
2006-May-11 16:32 UTC
[R] cannot turn some columns in a data frame into factors
> * jim holtman <wubygzna at tznvy.pbz> [2006-05-11 12:27:39 -0400]:
>
> try '<<-' as the assignment to make it global.
>
> df[[pos]] <<- factor(df[[pos]])

nothing changed -- I observe the exact same behaviour:

Month ( 1 ): TRUE
factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

-- 
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://pmw.org.il http://ffii.org http://memri.org http://palestinefacts.org
http://truepeace.org http://mideasttruth.com http://dhimmi.com
If you're being passed on the right, you're in the wrong lane.
Sam Steingold
2006-May-11 19:15 UTC
[R] cannot turn some columns in a data frame into factors
Thanks to everyone who took time to respond, both here on the list and
via private e-mail (I do read the list on gmane, so there is no reason
to CC me).

it turned out that R passes _structured_ arguments by value.
the solution I use now is:

df[factors] = lapply(df[factors],factor)
if (!all(sort(names(df)[sapply(df,is.factor)]) == sort(factors)))
  stop(paste("bad factors:",sort(names(df)[sapply(df,is.factor)]),"!=",
             sort(factors)))

it is based on a private e-mail reply by Phil Spector.

-- 
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://camera.org http://iris.org.il http://dhimmi.com http://memri.org
http://ffii.org http://jihadwatch.org http://pmw.org.il
PI seconds is a nanocentury
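
For anyone landing on this thread later, here is a minimal sketch of the behaviour described above. The toy data frame and its column names (Month, Year, Sales) are made up for illustration and are not the poster's data. It shows that an assignment to df inside the anonymous function only touches that function's own copy, and that assigning the result of lapply() back at the calling level makes the change stick.

```r
# Toy data; column names and values are assumptions for illustration only.
df <- data.frame(Month = c(1, 2, 2, 3),
                 Year  = c(2005, 2005, 2006, 2006),
                 Sales = c(10.5, 11.2, 9.8, 12.1))
factors <- c("Month", "Year")

# 1. Assigning to df inside a function modifies only the function's local copy.
invisible(sapply(factors, function(name) {
  df[[name]] <- factor(df[[name]])  # creates and changes a local df
  is.factor(df[[name]])             # TRUE here, inside the function
}))
before <- sapply(df, is.factor)     # the outer df has not changed

# 2. Convert the columns and assign them back where df actually lives.
df[factors] <- lapply(df[factors], factor)
after <- sapply(df, is.factor)

print(before)
print(after)
```

With this toy data, `before` should be FALSE for every column, and `after` should be TRUE for Month and Year only. Selecting columns by name with df[factors] also raises an error on a misspelled name, which covers the "no such column" check from the original loop.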