R-help,

After cogitating for a while, I finally figured out how to define a data.frame column as a factor and assign the levels within a function... BUT I still need to pass the data.frame and its name separately. I can't seem to find any other way to pass the name of the data.frame, rather than the data.frame itself. Any suggestions on how to go about it? Is there something like value(object) or name(object) that I can't find?

#sample dataframe for this example
y <- data.frame(
  one=c(1,1,3,3,5,7),
  two=c(2,2,6,6,8,8))

> levels(y$one)  # check out levels
NULL

# the function I've come up with
fncFact <- function(datfra, datfraNm){
  datfra$one <- factor(datfra$one, levels=c(1,3,5,7,9))
  assign(datfraNm, datfra, pos=1)
}

> fncFact(y, "y")
> levels(y$one)
[1] "1" "3" "5" "7" "9"

I suppose only for aesthetics and simplicity, I'd like to pass only the data.frame and get the same result.

Thanks in advance,
Tim Howard

> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R
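On the literal question of whether something like name(object) exists: base R's substitute() and deparse() can recover the expression a caller supplied for an argument. The sketch below is only an illustration of that idiom; fncFact2 is a hypothetical name that does not appear in the thread, and the approach assumes the function is called with a plain variable name (it would misbehave for a call such as fncFact2(head(y))).

```r
# Sample data from the post above
y <- data.frame(one = c(1, 1, 3, 3, 5, 7),
                two = c(2, 2, 6, 6, 8, 8))

# Hypothetical variant of fncFact that takes only the data.frame
fncFact2 <- function(datfra) {
  # substitute() returns the unevaluated argument expression and
  # deparse() turns it into a string such as "y". This must happen
  # before datfra is modified inside the function.
  datfraNm <- deparse(substitute(datfra))
  datfra$one <- factor(datfra$one, levels = c(1, 3, 5, 7, 9))
  # Write the modified copy back under the same name in the global environment
  assign(datfraNm, datfra, envir = .GlobalEnv)
  invisible(datfra)
}

fncFact2(y)
levels(y$one)   # "1" "3" "5" "7" "9"
```

This reproduces the behaviour asked for, but it still relies on a side effect in the global environment, so it is probably best read as an answer to the narrow question rather than a recommended design.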
Wouldn't it be easier to do this?

> levels(y$one) <- seq(1, 9, by=2)
> y$one
[1] 1 1 3 3 5 7
attr(,"levels")
[1] 1 3 5 7 9

Andy

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
Andy,

Thank you for the help. Yes, my question really did seem like I was going through a lot of unnecessary steps just to define levels of a variable. But that was just for the example. In my application, I bring new datasets into R on a daily basis. While the data differs, the variables are the same, and the categorical variables have the same levels. So I find myself daily applying the same factor and level definitions (by cutting and pasting the large chunk of commands from a text file). It really would be simpler to have it wrapped up in a function. That's why I asked the question about putting this into a function.

Upon reading your answer, I thought maybe I could use your example and use the super-assignment '<<-' in the function. But your method assigns levels without defining the variable as a factor (interesting!).

> levels(y$one) <- seq(1, 9, by=2)
> y$one
[1] 1 1 3 3 5 7
attr(,"levels")
[1] 1 3 5 7 9
> is.factor(y$one)
[1] FALSE

Unfortunately, whenever I try to use <<- with the dataframe as the variable, I get an error message:

> fncFact <- function(datfra){
+   datfra$one <<- factor(datfra$one, levels=c(1,3,5,7,9))
+ }
> fncFact(y)
Error in fncFact(y) : Object "datfra" not found

Tim

>>> "Liaw, Andy" <andy_liaw at merck.com> 4/20/2005 4:03:24 PM >>>
Wouldn't it be easier to do this?

> levels(y$one) <- seq(1, 9, by=2)
> y$one
[1] 1 1 3 3 5 7
attr(,"levels")
[1] 1 3 5 7 9

Andy
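A possible explanation for the "not found" error, offered as a sketch rather than a definitive account: a superassignment of the form datfra$one <<- value first has to fetch the object it is going to modify, and that lookup starts in the enclosing environment of the function (the global environment here), not in the function's own frame where the argument datfra lives. The snippet below reproduces the failure with the sample data; res and failed are illustrative names only.

```r
# Sample data from earlier in the thread
y <- data.frame(one = c(1, 1, 3, 3, 5, 7),
                two = c(2, 2, 6, 6, 8, 8))

fncFact <- function(datfra) {
  # The left-hand side `datfra$one` is resolved starting in the
  # enclosing environment, so the local argument is never seen there.
  datfra$one <<- factor(datfra$one, levels = c(1, 3, 5, 7, 9))
}

# try() captures the error so the rest of the script keeps running
res <- try(fncFact(y), silent = TRUE)
failed <- inherits(res, "try-error")  # TRUE unless a global `datfra` exists
failed
is.factor(y$one)                      # FALSE: y itself is untouched
```

Had an unrelated global object named datfra existed, the call would have modified that object silently instead of failing, which is one more argument for returning the result from the function.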
Tim,

> From: Tim Howard
>
> Upon reading your answer, I thought maybe I could use your example
> and use the super-assignment '<<-' in the function. But, your method
> assigns levels, but does not define the var as a factor
> (interesting!).
>
> > levels(y$one) <- seq(1, 9, by=2)
> > y$one
> [1] 1 1 3 3 5 7
> attr(,"levels")
> [1] 1 3 5 7 9
> > is.factor(y$one)
> [1] FALSE

Ouch! "levels<-" is generic, and the default method simply attaches the levels attribute to the object. You need to coerce the object into a factor explicitly.

> Unfortunately, whenever I try to use <<- with the dataframe as the
> variable, I get an error message:
>
> > fncFact <- function(datfra){
> +   datfra$one <<- factor(datfra$one, levels=c(1,3,5,7,9))
> + }
> > fncFact(y)
> Error in fncFact(y) : Object "datfra" not found

I believe the canonical way of doing something like this in R is something along the lines of:

processData <- function(dat) {
    dat$f1 <- factor(dat$f1, levels=...)
    ...  ## any other manipulations you want to do
    dat
}

Then when you get new data, you just do:

newData <- processData(newData)

HTH,
Andy
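To make the template above concrete, here is a minimal runnable sketch that fills in the elided parts with the column name and levels from the sample data earlier in the thread. Those substitutions are assumptions made for illustration; a real processData would list whatever columns and levels the daily datasets share. The first few lines also show the contrast with "levels<-" on a plain numeric vector.

```r
# Sample data from earlier in the thread
y <- data.frame(one = c(1, 1, 3, 3, 5, 7),
                two = c(2, 2, 6, 6, 8, 8))

# Setting levels on a numeric vector only attaches an attribute
x <- y$one
levels(x) <- seq(1, 9, by = 2)
is.factor(x)        # FALSE

# Return-value style: modify a local copy and hand it back
processData <- function(dat) {
  dat$one <- factor(dat$one, levels = c(1, 3, 5, 7, 9))
  dat
}

# The caller decides where the result is stored
y <- processData(y)
is.factor(y$one)    # TRUE
table(y$one)        # level 9 is kept, with a count of 0
```

Because the function has no side effects, the same call works unchanged for a data.frame stored under any name, or for one that is an element of a list.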
Aha! You've just opened the door to another level for this blundering R user. I even went back to my well-used copy of "An Introduction to R" to see where I missed this standard approach for processing new data. Nothing stated explicitly, but it is certainly alluded to in many of the function examples. I don't know why I was stuck in that rut.

I'm sure 99.9% of you on this list know this, but... To be clear for anyone searching these archives later:

Don't bother to ask your function to make assignments to pos=1 (the global environment); just do the assignment yourself when calling the function. For example, instead of coding a function call like this:

processData(dat)

to assign the processed data to pos=1, simply make the assignment when calling the function:

dat <- processData(dat)

Thanks for being gentle on me, Andy.

Tim

>>> "Liaw, Andy" <andy_liaw at merck.com> 4/21/2005 9:57:22 PM >>>
I believe the canonical way of doing something like this in R is something along the lines of:

processData <- function(dat) {
    dat$f1 <- factor(dat$f1, levels=...)
    ...  ## any other manipulations you want to do
    dat
}

Then when you get new data, you just do:

newData <- processData(newData)

HTH,
Andy
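For the daily-import workflow described earlier in the thread, where several categorical variables share fixed level sets, the same return-value pattern could be driven by a lookup list. This is only a sketch under assumptions: factorLevels is a hypothetical name, and the level set given for column two is invented for the example, since the thread only states levels for column one.

```r
# Hypothetical lookup: column name -> full set of levels.
# Only the levels for `one` come from the thread; those for `two` are made up.
factorLevels <- list(one = c(1, 3, 5, 7, 9),
                     two = c(2, 4, 6, 8))

processData <- function(dat, levs = factorLevels) {
  # Convert every column that has an entry in the lookup list
  for (nm in intersect(names(levs), names(dat))) {
    dat[[nm]] <- factor(dat[[nm]], levels = levs[[nm]])
  }
  dat
}

# A freshly imported dataset (sample data from the thread)
dat <- data.frame(one = c(1, 1, 3, 3, 5, 7),
                  two = c(2, 2, 6, 6, 8, 8))
dat <- processData(dat)
sapply(dat, nlevels)
```

Keeping the level definitions in one list means a new categorical variable only needs a new list entry, not another line in the function body.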