Muenchen, Robert A (Bob)
2006-Dec-18 16:10 UTC
[R] Applying variable labels across a data frame
Hi All,

I'm working on a class example that demonstrates one way to deal with factors and their labels. I create a function called myLabeler and apply it with lapply. It works on the whole data frame when I subscript it, as in lapply( myQFvars[ ,myQFnames ], myLabeler ), but does not work if I leave the [] subscripts off. I would appreciate it if anyone could tell me why. The program below works up until the final two statements.

Thanks,
Bob

# Assigning factor labels to potentially lots of vars.

mystring<-
("id,workshop,gender,q1,q2,q3,q4
1,1,f,1,1,5,1
2,2,f,2,1,4,1
3,1,f,2,2,4,3
4,2,f,3,1, ,3
5,1,m,4,5,2,4
6,2,m,5,4,5,5
7,1,m,5,3,4,4
8,2,m,4,5,5,9")

mydata<-read.table(textConnection(mystring),
   header=TRUE,sep=",",row.names="id",na.strings="9")
print(mydata)

# Create copies of q variables to use as factors
# so we can count them.
myQlevels <- c(1,2,3,4,5)
myQlabels <- c("Strongly Disagree",
   "Disagree",
   "Neutral",
   "Agree",
   "Strongly Agree")
print(myQlevels)
print(myQlabels)

# Generate two sets of var names to use.
myQnames <- paste( "q", 1:4, sep="")
myQFnames <- paste( "qf", 1:4, sep="")
print(myQnames) #The original names.
print(myQFnames) #The names for new factor variables.

# Extract the q variables to a separate data frame.
myQFvars <- mydata[ ,myQnames]
print(myQFvars)

# Rename all the variables with F for Factor.
colnames(myQFvars) <- myQFnames
print(myQFvars)

# Create a function to apply the labels to lots of variables.
myLabeler <- function(x) { factor(x, myQlevels, myQlabels) }

# Here's how to use the function on one variable.
summary( myLabeler(myQFvars["qf1"]) )

#Apply it to all the variables. This method works.
myQFvars[ ,myQFnames] <- lapply( myQFvars[ ,myQFnames ], myLabeler )
summary(myQFvars) #Here are the results I wanted.

# This is the same as above but using the unsubscripted
# data frame name. It does not work.
myTest <- lapply( myQFvars, myLabeler )
summary(myTest) #I'm not sure what these results are.
========================================================
Bob Muenchen (pronounced Min'-chen), Manager
Statistical Consulting Center
U of TN Office of Information Technology
200 Stokely Management Center, Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: muenchen at utk.edu
Web: http://oit.utk.edu/scc,
News: http://listserv.utk.edu/archives/statnews.html
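A side note on the one-variable call in the post: myQFvars["qf1"], with single brackets, returns a one-column data frame rather than the column itself, so factor() is handed a list instead of the individual values. The sketch below illustrates the difference. The data frame q and the helper labeler are small, made-up stand-ins for myQFvars and myLabeler, not the poster's objects.

```r
# Hypothetical stand-in for one column of myQFvars.
q <- data.frame(qf1 = c(1, 2, 2, 3, 4))

lev <- 1:5
lab <- c("Strongly Disagree", "Disagree", "Neutral",
         "Agree", "Strongly Agree")

# Same idea as myLabeler in the post, with named arguments.
labeler <- function(x) factor(x, levels = lev, labels = lab)

one_col <- q["qf1"]    # single brackets: still a data frame
vec     <- q[["qf1"]]  # double brackets: the numeric vector itself

is.data.frame(one_col)
is.numeric(vec)

# Labelling the vector gives one factor value per row.
f <- labeler(vec)
table(f)
```

Under this reading, summary( myLabeler(myQFvars[["qf1"]]) ) or summary( myLabeler(myQFvars$qf1) ) is probably closer to what the one-variable example intends.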
You get a list, not a data.frame. Try:

as.data.frame(lapply( myQFvars, myLabeler ))

Gabriel

---------------------------------------------------------------------
Gabriel Baud-Bovy               tel.: (+39) 02 2643 4839 (office)
UHSR University                       (+39) 02 2643 3429 (laboratory)
via Olgettina, 58                     (+39) 02 2643 4891 (secretary)
20132 Milan, Italy               fax: (+39) 02 2643 4892
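To illustrate the reply, here is a minimal sketch of why lapply() on a data frame returns a plain list, and two ways to get a data frame back. The names q, labeler, as_list, as_df and q2 are made up for the illustration, and the data are a small stand-in rather than the poster's full data set.

```r
# Hypothetical stand-in for myQFvars before any labelling.
q <- data.frame(qf1 = c(1, 2, 2, 3), qf2 = c(1, 1, 2, 5))

lev <- 1:5
lab <- c("Strongly Disagree", "Disagree", "Neutral",
         "Agree", "Strongly Agree")
labeler <- function(x) factor(x, levels = lev, labels = lab)

# lapply() always returns a list, one element per column.
as_list <- lapply(q, labeler)
class(as_list)

# Option 1 (from the reply): convert the list back to a data frame.
as_df <- as.data.frame(as_list)
class(as_df)

# Option 2: assign into q2[] so the data frame structure is kept.
q2 <- q
q2[] <- lapply(q2, labeler)
class(q2)

# Caution: q2 now holds labelled factors. Running the labeler on it
# again matches the label text against the levels 1:5, finds nothing,
# and yields NA for every value.
relabelled <- labeler(q2$qf1)
all(is.na(relabelled))
```

In the original script, the assignment on the left-hand side (myQFvars[ ,myQFnames] <- ...) is what preserved the data frame. Note also that myTest is built after myQFvars has already been converted, so even with as.data.frame() the second pass would likely show only NA values unless it is run on the unconverted numeric columns.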