All,

Given a data frame and a list containing factor definitions for certain columns, how can I apply those definitions from the list, rather than doing it the standard way, as noted below. I'm lost in the world of do.call, assign, paste, and can't find my way through. For example:

#set up df
y <- data.frame(colOne = c(1,2,3), colTwo = c("apple","pear","orange"))

factor.defs <- list(colOne = list(name = "colOne",
                                  lvl = c(1,2,3,4,5,6)),
                    colTwo = list(name = "colTwo",
                                  lvl = c("apple","pear","orange","fig","banana")))

#A standard way to define levels
y$colTwo <- factor(y$colTwo , levels = c("apple","pear","orange","fig","banana"))

# I'd like to use the definitions locally but also pass them (but not the data) to a function,
# so, rather than defining each manually each time, I'd like to loop through the columns,
# call them by name, find the definitions in the list and use them from there. Before I try to loop
# or use some form of apply, I'd like to get a single factor definition working.

# this doesn't seem to see the dataframe properly
do.call(factor,list((paste("y$",factor.defs[2][[1]]$name,sep="")),levels=factor.defs[2][[1]]$lvl))

#adding "as.name" doesn't help
do.call(factor,list(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),levels=factor.defs[2][[1]]$lvl))

#Here's my attempt to mimic the standard way, using assign. Ha! what a joke.
assign(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),
       do.call(factor,
               list(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),
                    levels = factor.defs[2][[1]]$lvl)))
##Error in function (x = character(), levels, labels = levels, exclude = NA, :
##  object 'y$colTwo' not found

Any help or perspective (or better way from the beginning!) would be greatly appreciated.
Thanks in advance!
Tim
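The error at the end of the post can be reproduced in isolation. The following is only a minimal sketch, reusing the y and factor.defs from the post (the names def and sym are introduced here for illustration): as.name() turns the whole pasted string into a single symbol, so R looks for one object literally called 'y$colTwo', whereas indexing with [[ accepts the column name as a character string.

```r
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))
factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

def <- factor.defs[[2]]   # same element as factor.defs[2][[1]]

# as.name() builds ONE symbol whose name is the full string "y$colTwo";
# it is not parsed into y, $, colTwo, so evaluating it fails with
# "object 'y$colTwo' not found".
sym <- as.name(paste("y$", def$name, sep = ""))

# Indexing with [[ takes the column name as a string, so no paste(),
# as.name(), assign() or do.call() is needed.
y[[def$name]] <- factor(y[[def$name]], levels = def$lvl)
levels(y$colTwo)
```

Without as.name(), the first do.call() attempt simply passes the literal string "y$colTwo" to factor() as the data, which is why it never touches the data frame.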
On Feb 9, 2011, at 3:44 PM, Tim Howard wrote:

> All,
>
> Given a data frame and a list containing factor definitions for
> certain columns, how can I apply those definitions from the list,
> rather than doing it the standard way, as noted below. I'm lost in
> the world of do.call, assign, paste, and can't find my way through.
> For example:
>
> #set up df
> y <- data.frame(colOne = c(1,2,3), colTwo = c("apple","pear","orange"))
>
> factor.defs <- list(colOne = list(name = "colOne",
>                                   lvl = c(1,2,3,4,5,6)),
>                     colTwo = list(name = "colTwo",
>                                   lvl = c("apple","pear","orange","fig","banana")))
>
> #A standard way to define levels
> y$colTwo <- factor(y$colTwo , levels = c("apple","pear","orange","fig","banana"))

Here's a one-item way of using factor.defs. I thought it would be pretty easy to loop through it with lapply or do.call, but it's not immediately obvious once I get down to the nitty-gritty.

> y[factor.defs[[1]]$name] <- factor(y[[factor.defs[[1]]$name]], levels = factor.defs[[1]]$lvl)
> y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange

levels(y$colOne)
#[1] "1" "2" "3" "4" "5" "6"

Note the different uses of "[" and "[[" on each side of the assignment.

This works on your example, but I don't think it would leave the non-targeted columns in place:

y <- as.data.frame(
        lapply(factor.defs, function(x) {
             y[[x$name]] <- factor(y[[x$name]], levels = x$lvl)
        } )
     )
y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange

I wonder if I could leave out the as.data.frame part and make an assignment in the parent.frame instead?

y <- within(y, lapply(factor.defs, function(x) {
             y[[x$name]] <- factor(y[[x$name]], levels = x$lvl)
        } ) )
y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange

The output looks the same, but that is not much of a test: y had already been converted by the previous step, and the assignment inside the anonymous function is local to that function, so within() discards the lapply() result and should return y unchanged. You should construct a more complex test set and report back.
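One way to keep the non-targeted columns is a plain for loop that assigns with [[ on both sides. This is only a sketch, assuming every definition's name matches a column of y; the extra column colThree is made up here to give the test something the list does not mention.

```r
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"),
                colThree = c(10, 20, 30))   # hypothetical column, absent from factor.defs
factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

# Iterating over the list hands each definition (name + lvl) to the body.
# The assignment happens in the calling environment, so y itself is updated
# one column at a time and every other column is left alone.
for (def in factor.defs) {
  y[[def$name]] <- factor(y[[def$name]], levels = def$lvl)
}

str(y)
```

Because the loop body runs in the same environment as y, there is no need for assign() or parent.frame() tricks.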
-- 
David.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Tim Howard
> Sent: Wednesday, February 09, 2011 12:44 PM
> To: r-help at r-project.org
> Subject: [R] assign factor levels based on list
>
> All,
>
> Given a data frame and a list containing factor definitions
> for certain columns, how can I apply those definitions from
> the list, rather than doing it the standard way, as noted
> below. I'm lost in the world of do.call, assign, paste, and
> can't find my way through. For example:
>
> #set up df
> y <- data.frame(colOne = c(1,2,3), colTwo = c("apple","pear","orange"))
>
> factor.defs <- list(colOne = list(name = "colOne",
>                                   lvl = c(1,2,3,4,5,6)),
>                     colTwo = list(name = "colTwo",
>                                   lvl = c("apple","pear","orange","fig","banana")))

Why not the following format?

   my.factor.defs <- list(colOne = c(1,2,3,4,5,6),
                          colTwo = c("apple", "pear", "orange", "fig", "banana"))

Do you really want to support a case like the following?

   list(colOne = list(name = "anotherColumn", lvl = c(1,2,3,4,5,6)))

> #A standard way to define levels
> y$colTwo <- factor(y$colTwo , levels = c("apple","pear","orange","fig","banana"))
>
> # I'd like to use the definitions locally but also pass them
> # (but not the data) to a function,
> # so, rather than defining each manually each time, I'd like
> # to loop through the columns,
> # call them by name, find the definitions in the list and use
> # them from there. Before I try to loop
> # or use some form of apply, I'd like to get a single factor
> # definition working.

First write a function that takes a data.frame and list of desired levels for each column and outputs a new data.frame.
E.g., if you use the simpler form of the levelsList I gave above, the following might work well enough (it does no error checking):

   assignNewLevelsToDataFrameColumns <- function(x, levelsList) {
      for(colName in names(levelsList)) {
         # note that x$name is equivalent to x[["name"]], so
         # if you want to use a variable as the name, use [[.
         x[[colName]] <- factor(x[[colName]], levels=levelsList[[colName]])
      }
      x
   }

Test it:

   > fixedY <- assignNewLevelsToDataFrameColumns(y, my.factor.defs)
   > fixedY
     colOne colTwo
   1      1  apple
   2      2   pear
   3      3 orange
   > str(fixedY)
   'data.frame':   3 obs. of  2 variables:
    $ colOne: Factor w/ 6 levels "1","2","3","4",..: 1 2 3
    $ colTwo: Factor w/ 5 levels "apple","pear",..: 1 2 3

Do

   > y <- assignNewLevelsToDataFrameColumns(y, my.factor.defs)

if you want to overwrite the old y.

Now if you want a function that changes the data.frame you give it, use a replacement function. If you want to use the syntax

   > func(y) <- newStuff

then the function should be called `func<-` and the last argument must be called 'value' (newStuff will be passed via value=newStuff). E.g.,

   `func<-` <- function(x, value) {
      alteredX <- assignNewLevelsToDataFrameColumns(x, value)
      alteredX
   }

and use it as

   > func(y) <- my.factor.defs
   > str(y)
   'data.frame':   3 obs. of  2 variables:
    $ colOne: Factor w/ 6 levels "1","2","3","4",..: 1 2 3
    $ colTwo: Factor w/ 5 levels "apple","pear",..: 1 2 3

The first command gets translated into

   y <- `func<-`(y, value=my.factor.defs)

If you write a replacement function, it is nice to create a matching extractor function called 'func'. E.g.,

   > func <- function(x) lapply(x, levels)
   > func(y)
   $colOne
   [1] "1" "2" "3" "4" "5" "6"

   $colTwo
   [1] "apple"  "pear"   "orange" "fig"    "banana"

Note that this avoids assign(), get(), eval(), etc., and thus makes it easy to follow the flow of data in the code: only things on the left side of the assignment arrow can get changed.
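Since the function above does no error checking, here is one possible way to add some. It is a sketch only: the name assignLevelsChecked and the choice of what to check (unknown columns stop, values outside the level set warn) are assumptions made for illustration, not part of the code above. It also shows how the nested factor.defs from the original post could be reduced to the simpler format, assuming each outer name equals the inner name field.

```r
# Hypothetical checked variant of the level-assigning function.
assignLevelsChecked <- function(x, levelsList) {
  stopifnot(is.data.frame(x), is.list(levelsList), !is.null(names(levelsList)))
  missingCols <- setdiff(names(levelsList), names(x))
  if (length(missingCols) > 0) {
    stop("columns not found in data frame: ",
         paste(missingCols, collapse = ", "))
  }
  for (colName in names(levelsList)) {
    old <- x[[colName]]
    new <- factor(old, levels = levelsList[[colName]])
    # factor() silently turns values outside the level set into NA;
    # count the newly created NAs and warn about them.
    lost <- sum(is.na(new) & !is.na(old))
    if (lost > 0) {
      warning(lost, " value(s) in '", colName,
              "' are not among the supplied levels")
    }
    x[[colName]] <- new
  }
  x
}

y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))
factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

# Reduce the nested format to the simpler one; lapply keeps the outer names.
my.factor.defs <- lapply(factor.defs, function(d) d$lvl)

fixedY <- assignLevelsChecked(y, my.factor.defs)
str(fixedY)
```

Whether an out-of-set value should warn or stop is a design choice; a warning keeps the behaviour close to plain factor() while still making the data loss visible.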
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com