Sam Steingold
2006-May-11 16:09 UTC
[R] cannot turn some columns in a data frame into factors
Hi,
I have a data frame df and a list of names of columns that I want to
turn into factors:
df.names <- attr(df,"names")
sapply(factors, function (name) {
  pos <- match(name,df.names)
  if (is.na(pos)) stop(paste(name,": no such column\n"))
  df[[pos]] <- factor(df[[pos]])
  cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n")
})
cat("factors:",sapply(df,is.factor),"\n")
the output is:
Month ( 1 ): TRUE
factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
i.e., there is a column named "Month" (the 1st column), and it is indeed
turned into a factor inside sapply(), but after that it is numerical
again!
what am I doing wrong?
--
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://honestreporting.com http://truepeace.org http://openvotingconsortium.org
http://thereligionofpeace.com http://memri.org http://palestinefacts.org
UNIX, car: hard to learn/easy to use; Windows, bike: hard to learn/hard to use.
jim holtman
2006-May-11 16:27 UTC
[R] cannot turn some columns in a data frame into factors
try '<<-' as the assignment to make it global.
df[[pos]] <<- factor(df[[pos]])
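A minimal sketch of what this suggestion does, assuming df is defined in the environment that encloses the anonymous function (here, the top level). The toy data frame and its column names are made up for illustration, and the follow-up in this thread reports that the change did not help in the original poster's setup.

```r
# Toy data frame; the column names and values are illustrative only.
df <- data.frame(Month = c(1, 2, 2), Value = c(10, 20, 30))
factors <- c("Month")

# invisible() hides sapply()'s return value; only the side effect matters here.
invisible(sapply(factors, function(name) {
  pos <- match(name, names(df))
  if (is.na(pos)) stop(name, ": no such column")
  # `<<-` looks up df in the enclosing environments and assigns there,
  # so the change outlives the function call.
  df[[pos]] <<- factor(df[[pos]])
}))

print(sapply(df, is.factor))
```

Note that `<<-` modifies whichever df it finds first in the enclosing environments, which is not necessarily the global one.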
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)
What is the problem you are trying to solve?
Sam Steingold
2006-May-11 16:32 UTC
[R] cannot turn some columns in a data frame into factors
> * jim holtman <wubygzna at tznvy.pbz> [2006-05-11 12:27:39 -0400]:
>
> try '<<-' as the assignment to make it global.
>
> df[[pos]] <<- factor(df[[pos]])
nothing changed -- I observe the exact same behaviour:
Month ( 1 ): TRUE
factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
--
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://pmw.org.il http://ffii.org http://memri.org http://palestinefacts.org
http://truepeace.org http://mideasttruth.com http://dhimmi.com
If you're being passed on the right, you're in the wrong lane.
Sam Steingold
2006-May-11 19:15 UTC
[R] cannot turn some columns in a data frame into factors
Thanks to everyone who took time to respond, both here on the list and
via private e-mail (I do read the list on gmane, so there is no reason
to CC me).
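it turned out that R has copy semantics: the assignment to df[[pos]]
inside the function passed to sapply() modifies a local copy of df, and
the original data frame is left untouched.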
the solution I use now is:
df[factors] <- lapply(df[factors], factor)
## check that exactly the requested columns are now factors
is.fac <- sort(names(df)[sapply(df, is.factor)])
if (!setequal(is.fac, factors))
  stop("bad factors: ", paste(is.fac, collapse = " "), " != ",
       paste(sort(factors), collapse = " "))
it is based on a private e-mail reply by Phil Spector.
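The behaviour discussed in this thread can be reproduced on a toy data frame. The sketch below is illustrative only: the data, the column names, and the helper name convert_locally are assumptions, not part of the original code. It first shows that assigning to df inside a function changes only a local copy, then applies the one-line lapply() conversion at the top level.

```r
# Toy data; column names and values are made up for illustration.
df <- data.frame(Month = c(1, 2, 2), Day = c(5, 6, 7), Temp = c(60, 65, 70))
factors <- c("Month", "Day")

# 1. Assigning inside a function changes only a local copy of df.
convert_locally <- function(name) {
  df[[name]] <- factor(df[[name]])  # creates a local df in this call's frame
  is.factor(df[[name]])             # TRUE, but only for the local copy
}
inside <- sapply(factors, convert_locally)
after_local <- sapply(df, is.factor)  # the outer df is unchanged

# 2. Converting all requested columns in one assignment at the top level.
df[factors] <- lapply(df[factors], factor)
after_fix <- sapply(df, is.factor)

print(inside)
print(after_local)
print(after_fix)
```

Because the second form assigns to df in the environment where df lives, no function-local copy is involved.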
--
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://camera.org http://iris.org.il http://dhimmi.com
http://memri.org http://ffii.org http://jihadwatch.org http://pmw.org.il
PI seconds is a nanocentury