andrewH
2011-Oct-07 07:03 UTC
[R] Unexpected behavior of extract (`[`) or sapply functions
Dear folks--
The function below is a snippet of a larger function that is not doing what
it is supposed to do, and I do not understand its behavior. The larger
function is supposed to produce an array containing the results of a
user-specified function applied to groups of data defined by the
intersection of one or more factors, and return them in an array with a
dimension for each factor and a dimension level for each factor level. This
snippet is supposed to take a data frame, a vector of column numbers
containing factors, and a column number for the data, and return (in the
test function below, just print) a list of character vectors of the level
names (one vector per dimension) and the length of those vectors.
It works fine so long as I give it more than one factor column, but if I
give it a vector of factor columns of length 1, it behaves differently, and
when I try to assign the names from test.levels to the dimnames of the
array, I end up with an error message:
Error in dimnames(data) <- dimnames :
length of 'dimnames' [1] not equal to array extent
The example below shows the function output for a test data frame
("test.df") when run first on a vector of two column numbers for factors and
then on just one. You can see how the structure of the output shifts. I
cannot understand what is happening. What I want it to do when given just
factor.cols = c(1) is to give me back exactly what it gives me back for
factor column 1 in factor.cols = c(1,2).
Any help or suggestions would be greatly appreciated.
Sincerely,
andrewH
# Test Data
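# stringsAsFactors = TRUE is needed on R >= 4.0.0, where character columns
# are no longer converted to factors by default.
test.df <- data.frame(AA = rep(LETTERS[1:2], c(6, 6)),
                      BB = rep(LETTERS[3:5], c(4, 4, 4)),
                      CC = rep(LETTERS[6:9], c(3, 3, 3, 3)),
                      DD = c(1:12),
                      stringsAsFactors = TRUE)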
# The function
getLevels <- function(data.df, factor.cols, data.col){
  # Level names for each selected factor column
  test.levels <- sapply(data.df[, factor.cols, drop=FALSE], levels)
  cat("test.levels:\n"); print(test.levels)
  # Number of levels for each selected factor column
  no.levels <- sapply(sapply(data.df[, factor.cols, drop=FALSE], levels), length)
  cat("no.levels:\n"); print(no.levels)
}
# Run it with two factors and again with 1, Output below
cat("\nTest 2 factors:\n")
getLevels(test.df, c(1,2), 4)
cat("\nTest 1 factor:\n")
getLevels(test.df, c(1), 4)
Test 2 factors:
> getLevels(test.df, c(1,2), 4)
test.levels
$AA
[1] "A" "B"

$BB
[1] "C" "D" "E"

no.levels=AA BB
 2  3
> cat("\nTest 1 factor:\n")

Test 1 factor:
> getLevels(test.df, c(1), 4)
test.levels=     AA
[1,] "A"
[2,] "B"
no.levels=A B
1 1
--
View this message in context:
http://r.789695.n4.nabble.com/Unexpected-behavior-of-extract-or-sapply-functions-tp3881176p3881176.html
Sent from the R help mailing list archive at Nabble.com.
Petr PIKAL
2011-Oct-07 07:58 UTC
[R] Unexpected behavior of extract (`[`) or sapply functions
Hi,

Is it necessary to use sapply? With lapply you will get what you want.

Regards
Petr
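
A minimal sketch of the lapply suggestion, assuming the goal is a named list of level vectors no matter how many factor columns are passed. The reason sapply behaves differently with one column is that it simplifies its result whenever all pieces have the same length, and with a single column that is trivially true, so the list collapses to a one-column character matrix. lapply never simplifies. The rewritten getLevels() below returns its results rather than printing them, and drops the unused data.col argument; both are assumptions about how the larger function would use it.

```r
# Same test data as in the original post; stringsAsFactors = TRUE keeps the
# columns as factors on R >= 4.0.0.
test.df <- data.frame(AA = rep(LETTERS[1:2], c(6, 6)),
                      BB = rep(LETTERS[3:5], c(4, 4, 4)),
                      CC = rep(LETTERS[6:9], c(3, 3, 3, 3)),
                      DD = 1:12,
                      stringsAsFactors = TRUE)

# Hypothetical rewrite of getLevels(): lapply always returns a list, so the
# structure is the same for one factor column or several.
getLevels <- function(data.df, factor.cols) {
  level.list <- lapply(data.df[, factor.cols, drop = FALSE], levels)
  # vapply guarantees an integer vector with one entry per factor column
  no.levels <- vapply(level.list, length, integer(1))
  list(levels = level.list, no.levels = no.levels)
}

two <- getLevels(test.df, c(1, 2))
one <- getLevels(test.df, 1)

# With sapply, the single-column case is simplified to a matrix instead
simplified <- sapply(test.df[, 1, drop = FALSE], levels)

# The list form can be used directly as dimnames for the result array
arr <- array(NA_real_, dim = unname(one$no.levels), dimnames = one$levels)

str(one)
str(simplified)
```

If sapply has to stay for other reasons, calling it with simplify = FALSE should give the same list structure as lapply.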