andrewH
2011-Oct-07 07:03 UTC
[R] Unexpected behavior of extract (`[`) or sapply functions
Dear folks--

The function below is a snippet of a larger function that is not doing what it is supposed to do, and I do not understand its behavior. The larger function is supposed to produce an array containing the results of a user-specified function applied to groups of data defined by the intersection of one or more factors, and return them in an array with a dimension for each factor and a dimension level for each factor level. This snippet is supposed to take a data frame, a vector of column numbers containing factors, and a column number for the data, and return (in the test function below, just print) a list of character vectors of the level names (one vector per dimension) and the lengths of those vectors.

It works fine so long as I give it more than one factor column, but if I give it a vector of factor columns of length 1, it behaves differently, and when I try to assign the names from test.levels to the dimnames of the array, I end up with an error message:

Error in dimnames(data) <- dimnames :
  length of 'dimnames' [1] not equal to array extent

The example below shows the function output for a test data frame ("test.df") when run first on a vector of two factor column numbers and then on just one. You can see how the structure of the output shifts. I cannot understand what is happening. What I want it to do when given just factor.cols = c(1) is to give me back exactly what it gives me back for factor column 1 when factor.cols = c(1,2).

Any help or suggestions would be greatly appreciated.
Sincerely,
andrewH

# Test Data
test.df <- data.frame(AA=rep(LETTERS[1:2], c(6,6)),
                      BB=rep(LETTERS[3:5], c(4,4,4)),
                      CC=rep(LETTERS[6:9], c(3,3,3,3)),
                      DD=c(1:12))

# The function
getLevels <- function(data.df, factor.cols, data.col){
  test.levels <- sapply(test.df[,factor.cols, drop=F], levels)
  cat("test.levels:\n"); print(test.levels)
  no.levels <- sapply(sapply(data.df[,factor.cols, drop=F], levels), length)
  cat("no.levels:\n"); print(no.levels)
}

# Run it with two factors and again with 1, Output below
cat("\nTest 2 factors:\n")
getLevels(test.df, c(1,2), 4)
cat("\nTest 1 factor:\n")
getLevels(test.df, c(1), 4)

Test 2 factors:
> getLevels(test.df, c(1,2), 4)
test.levels=
$AA
[1] "A" "B"

$BB
[1] "C" "D" "E"

no.levels=
AA BB
 2  3
> cat("\nTest 1 factor:\n")

Test 1 factor:
> getLevels(test.df, c(1), 4)
test.levels=
     AA
[1,] "A"
[2,] "B"
no.levels=
A B
1 1

--
View this message in context: http://r.789695.n4.nabble.com/Unexpected-behavior-of-extract-or-sapply-functions-tp3881176p3881176.html
Sent from the R help mailing list archive at Nabble.com.
Petr PIKAL
2011-Oct-07 07:58 UTC
[R] Unexpected behavior of extract (`[`) or sapply functions
Hi

Is it necessary to use sapply? With lapply you will get what you want.

Regards
Petr
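
A short illustration of the lapply suggestion may help here. sapply tries to simplify its result: when every element it collects has the same length greater than one, it returns a matrix instead of a list. With a single factor column that condition always holds, so the level names come back as a one-column character matrix, and the outer sapply(..., length) then runs over each matrix element, which gives the "A B / 1 1" output in the question. The same thing would happen with several factors that all have the same number of levels. The sketch below is only one possible rewrite: getLevelsList is a hypothetical name, not part of the original function, and stringsAsFactors = TRUE is an assumption added so the example still produces factors on R 4.0.0 and later.

```r
# Test data from the original post. stringsAsFactors = TRUE is an added
# assumption: R >= 4.0.0 no longer converts strings to factors by default.
test.df <- data.frame(AA = rep(LETTERS[1:2], c(6, 6)),
                      BB = rep(LETTERS[3:5], c(4, 4, 4)),
                      CC = rep(LETTERS[6:9], c(3, 3, 3, 3)),
                      DD = 1:12,
                      stringsAsFactors = TRUE)

# Hypothetical rewrite of getLevels(): lapply() always returns a list,
# so the result has the same shape for one factor column as for several.
getLevelsList <- function(data.df, factor.cols) {
  level.names <- lapply(data.df[, factor.cols, drop = FALSE], levels)
  # level.names is guaranteed to be a list, so length() sees whole vectors.
  no.levels <- sapply(level.names, length)
  list(level.names = level.names, no.levels = no.levels)
}

one <- getLevelsList(test.df, 1)
two <- getLevelsList(test.df, c(1, 2))

# For comparison: sapply() on a single factor column simplifies to a matrix.
simplified <- sapply(test.df[, 1, drop = FALSE], levels)
is.matrix(simplified)                               # TRUE
is.list(one$level.names)                            # TRUE
identical(one$level.names$AA, two$level.names$AA)   # TRUE

# The list can be passed straight to dimnames, one entry per factor.
arr <- array(NA, dim = unname(one$no.levels), dimnames = one$level.names)
```

If sapply is preferred for other reasons, sapply(..., simplify = FALSE) also keeps the result as a list.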