David Wyllie
2010-Mar-12 10:33 UTC
[R] Length as fun.aggregate in cast function of reshape package: unexpected error
Dear Everyone,

I am having problems using the reshape package's cast function with length as the aggregating function. Unexpectedly, I receive the error:

2 arguments passed to 'length' which requires 1

I don't understand this at all: the data I'm using is very simple, and appears almost identical to that used in the ChickWeight example in the package. The problem can be reproduced and is described in more detail below, along with the rationale for what we're trying to do.

Any help would be very gratefully received.

Best wishes
David Wyllie

(code starts here)
# When using the reshape package to summarise laboratory data,
# an unexpected error message occurs when using fun.aggregate = length.
# Tested on R 2.9.1 and R 2.10.0 on XP and Vista.

# The error message is:
# Error in FUN(X[[1L]], ...) :
#   2 arguments passed to 'length' which requires 1

# Other summary functions, including mean, range and sd, work as expected.
# I am new to both R and reshape, so there may be something very basic that
# I don't understand here.

# In the real situation a data table is pulled using RODBC from an SQL server.
# The SQL server holds the data in a 'long' format (an example is below),
# which appears (to me) to be identical to that generated by melt.
# No melt operation is therefore carried out on the data, which looks exactly
# like the 'testdata' data frame described below.

# testdata contains four numeric biochemistry results (Na, K, Ur, Cr) from a
# single test, identified by having rowid = 1.
# The objective is to compute summaries of this table by a variety of
# explanatory variables; in this case, there is only one, 'rowid'.
# I had thought that reshape would be an easy and consistent way of doing this,
# and so it is, for means, max, min and other standard statistics,
# but I can't work out how to count the number of observations in each cell
# of the resulting data frame. I had thought that length would do this, but it doesn't.
# load the reshape package
library(reshape)

# define a test data set
testdata <- data.frame(
  rowid = c(1, 1, 1, 1),
  variable = c("Na", "K", "Ur", "Cr"),
  value = c(130, 4, 5, 100)
)

# show it
testdata

# defining rowid as a factor doesn't alter the problem below;
# am unclear whether this is needed at all by reshape
testdata$rowid <- factor(testdata$rowid)

# works correctly
cast(testdata, variable ~ ., mean, na.rm = TRUE)

# works correctly
cast(testdata, ~ ., mean, na.rm = TRUE)

# works correctly
cast(testdata, ~ ., fun.aggregate = mean, na.rm = TRUE)

# min, max, sd, range also work as expected

# all the below fail with "2 arguments passed to 'length' which requires 1"
cast(testdata, variable ~ ., fun.aggregate = length, na.rm = TRUE)
cast(testdata, variable ~ ., length, na.rm = TRUE)
cast(testdata, ~ ., length, na.rm = TRUE)
jim holtman
2010-Mar-12 10:36 UTC
[R] Length as fun.aggregate in cast function of reshape package: unexpected error
'length' does not have an 'na.rm' argument, which is what you are trying to pass to it. If you want to exclude NAs from the count, write your own function using na.omit:

> x <- c(1,2,3,NA,4,5,NA,6)
> length(x)
[1] 8
> length(na.omit(x))
[1] 6
> cast(testdata, ~., function(x) length(na.omit(x)))

--
Jim Holtman
Cincinnati, OH

What is the problem that you are trying to solve?
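The mechanism behind the error can be reproduced without reshape. The following is a minimal base R sketch, not the package's actual internals: it assumes cast forwards extra arguments such as na.rm to the aggregating function in the same way sapply does, which the FUN(X[[1L]], ...) in the error message suggests. The NA in the test data and the helper name count_non_na are added here purely for illustration.

```r
# Base R only; no packages required.
testdata <- data.frame(
  rowid    = factor(c(1, 1, 1, 1)),
  variable = c("Na", "K", "Ur", "Cr"),
  value    = c(130, 4, NA, 100)  # one NA added for illustration
)

# One vector of values per variable, similar in spirit to cast's cells.
groups <- split(testdata$value, testdata$variable)

# Extra arguments given to sapply() are forwarded to FUN.
# length() accepts exactly one argument, so forwarding na.rm fails.
res <- tryCatch(
  sapply(groups, length, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)

# Hypothetical helper: counts non-missing values and tolerates
# (ignores) any extra arguments passed through '...'.
count_non_na <- function(x, ...) sum(!is.na(x))

counts <- sapply(groups, count_non_na, na.rm = TRUE)
print(counts)
```

Because count_non_na accepts '...', it can be passed wherever an aggregating function might also be handed na.rm = TRUE. Alternatively, calling length without na.rm avoids the error but counts missing values as well.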