Hi,

I use sapply very frequently, but I have recently noticed a behavior of sapply which I don't understand and have never seen before. Basically, sapply returns what looks like a matrix, says it's a matrix, and appears to let me do matrix things (like transpose). But it is also a list and behaves like a list when I subset it, not a vector (so I can't sort a row, for instance). I don't know where this is coming from so as to avoid it, nor how to handle the beast that sapply is returning. I double checked my old version of R and apparently this same thing happens in 1.8.0, though I never experienced it. I had a hard time reproducing it, and I don't know what's setting it off, but the code below seems to do it for me. (I'm using R on Windows XP, either 1.8.0 or 1.9.1)

Thanks for any help,
Elizabeth Purdom

> temp2<-matrix(sample(1:6,6,replace=F),byrow=F,nrow=6,ncol=4)
> colnames(temp2)<-paste("A",as.character(1:4),sep="")
> temp2<-as.data.frame(temp2)
> newtemp2<-sapply((1:6),function(x){xmat<-temp2[temp2[,1]==x,,drop=F];return(xmat[1,])})
> print(newtemp2) #looks like matrix
   [,1] [,2] [,3] [,4] [,5] [,6]
A1 1    2    3    4    5    6
A2 1    2    3    4    5    6
A3 1    2    3    4    5    6
A4 1    2    3    4    5    6
> is.matrix(newtemp2) #says it's matrix
[1] TRUE
> class(newtemp2)
[1] "matrix"
> is.list(newtemp2) #but also list
[1] TRUE
> newtemp2[,1] #can't subset and get a vector back; same thing happens for rows.
$A1
[1] 1

$A2
[1] 1

$A3
[1] 1

$A4
[1] 1

#other things about it:
> names(newtemp2)
NULL
> dimnames(newtemp2)
[[1]]
[1] "A1" "A2" "A3" "A4"

[[2]]
NULL

> dim(newtemp2)
[1] 4 6
> length(newtemp2)
[1] 24
The problem is that temp2 is a data frame, and the function you are sapply()ing returns a row from a data frame. A data frame is really a list, with each variable corresponding to a component. If you extract a row of a data frame, you get another data frame, not a vector, even if all variables are the same type. sapply() can really `simplify' the right way only if the function it applies returns a vector (or matrix). Consider:

> str(temp2)
`data.frame':   6 obs. of  4 variables:
 $ A1: int  5 2 4 6 1 3
 $ A2: int  5 2 4 6 1 3
 $ A3: int  5 2 4 6 1 3
 $ A4: int  5 2 4 6 1 3
> temp2 <- as.matrix(temp2)
> str(temp2)
 int [1:6, 1:4] 5 2 4 6 1 3 5 2 4 6 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:6] "1" "2" "3" "4" ...
  ..$ : chr [1:4] "A1" "A2" "A3" "A4"
> str(sapply(1:6,function(x){xmat<-temp2[temp2[,1]==x,,drop=F]; xmat[1,]}))
 int [1:4, 1:6] 1 1 1 1 2 2 2 2 3 3 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:4] "A1" "A2" "A3" "A4"
  ..$ : NULL

(The is.matrix() function probably just checks whether the dim attribute is a vector of length 2 and whether the object is not a data frame (as it says in ?is.matrix). The newtemp2 object you get is a list with 24 components, where each component is a vector of one integer, and the list has a dim attribute of c(4, 6). Not what I would call a matrix.)

HTH,
Andy

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
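To make the "list with a dim attribute" point concrete, here is a minimal sketch that builds such an object by hand. The names x, m, col1 and df are made up for illustration and do not come from the original post.

```r
# A list of 24 one-element integer vectors, given a dim attribute by hand.
# (x, m, col1 and df are hypothetical names used only for this illustration.)
x <- as.list(rep(1:6, each = 4))
dim(x) <- c(4, 6)

is.matrix(x)    # TRUE: it has a dim attribute of length 2
is.list(x)      # TRUE: the underlying storage is still a list
typeof(x)       # "list"

# Subsetting a column gives back a list, not an atomic vector.
col1 <- x[, 1]
is.list(col1)   # TRUE

# An ordinary matrix is an atomic vector with a dim attribute.
m <- matrix(rep(1:6, each = 4), nrow = 4)
typeof(m)       # "integer"

# A row of a data frame is still a data frame (and hence a list).
df <- data.frame(A1 = 1:3, A2 = 1:3)
is.data.frame(df[1, ])  # TRUE
```

So is.matrix() being TRUE only says something about the dim attribute; typeof() or is.list() tells you what the elements are stored in.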
Elizabeth Purdom wrote:

> > temp2<-matrix(sample(1:6,6,replace=F),byrow=F,nrow=6,ncol=4)
> > colnames(temp2)<-paste("A",as.character(1:4),sep="")
> > temp2<-as.data.frame(temp2)

It is this coercion to a data frame that is injecting a list-like property into the result. Try your script without that line and it will work as you expect.

> > newtemp2<-sapply((1:6),function(x){xmat<-temp2[temp2[,1]==x,,drop=F];return(xmat[1,])})
> > print(newtemp2) #looks like matrix
>    [,1] [,2] [,3] [,4] [,5] [,6]
> A1 1    2    3    4    5    6
> A2 1    2    3    4    5    6
> A3 1    2    3    4    5    6
> A4 1    2    3    4    5    6

The best thing to do in a situation like this is to use the str function to see the details of the structure of the object.
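As a rough illustration of the str() advice, the sketch below runs the same extraction on a matrix and on its data frame version and compares the two results. The helper first_rows and the seed are assumptions added here so the example is reproducible; they are not part of the original script.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

temp2 <- matrix(sample(1:6), nrow = 6, ncol = 4)
colnames(temp2) <- paste("A", 1:4, sep = "")

# Hypothetical helper: the extraction from the original post, parameterised
# on the object d so it can be run on a matrix and on a data frame.
first_rows <- function(d) {
  sapply(1:6, function(x) d[d[, 1] == x, , drop = FALSE][1, ])
}

from_mat <- first_rows(temp2)                 # rows are plain integer vectors
from_df  <- first_rows(as.data.frame(temp2))  # rows are one-row data frames

str(from_mat)   # an integer matrix
str(from_df)    # a list of 24 components with a dim attribute

typeof(from_mat)  # "integer"
typeof(from_df)   # "list"
```

Both results print much the same way, which is why print() alone is misleading here; str() and typeof() show the difference immediately.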
Elizabeth Purdom <epurdom <at> stanford.edu> writes:

: > newtemp2<-sapply((1:6),function(x){xmat<-temp2[temp2[,1]==x,,drop=F];return(xmat[1,])})
: > is.list(newtemp2) #but also list
: [1] TRUE

The problem is that your function returns a one-row data frame. When sapply tries to simplify the resulting list of 6 data frames, it produces a list with dimensions rather than what you were expecting, which is a vector with dimensions.
Let us call the original anonymous function in your post (i.e. the one passed to sapply there) f. We can modify it to produce f2, which is like f except that we wrap the return expression in unlist() to turn it into a vector (unlist() rather than c(), because c() applied to a data frame still returns a list):

f2 <- function(x){xmat<-temp2[temp2[,1]==x,,drop=F];return(unlist(xmat[1,]))}
sapply(1:6, f2)

If you really do want to return a one-row data frame, then use rbind to bind the data frames together rather than sapply:

do.call("rbind", lapply(1:6, f))
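Here is a small self-contained check of the two suggestions, plus one possible way to repair a list matrix that has already been created. It is only a sketch: it assumes the row is flattened with unlist(), it uses a fixed temp2 instead of the random one in the post, and the names broken, fixed, repaired and bound are invented for the example.

```r
# Fixed (non-random) stand-in for temp2: every column is 1:6.
temp2 <- as.data.frame(matrix(rep(1:6, 4), nrow = 6,
                              dimnames = list(NULL, paste("A", 1:4, sep = ""))))

# f is the function from the original post; f2 flattens its result.
f  <- function(x) { xmat <- temp2[temp2[, 1] == x, , drop = FALSE]; xmat[1, ] }
f2 <- function(x) unlist(f(x))   # assumption: unlist() is used to get a vector

broken <- sapply(1:6, f)    # list with a dim attribute
fixed  <- sapply(1:6, f2)   # ordinary integer matrix

typeof(broken)  # "list"
typeof(fixed)   # "integer"
sort(fixed[1, ])            # row operations now work on atomic vectors

# One possible repair for an existing list matrix: flatten it and
# re-apply the dimensions and dimnames.
repaired <- matrix(unlist(broken), nrow = nrow(broken),
                   dimnames = dimnames(broken))

# Keeping data frames instead: bind the one-row results together.
bound <- do.call("rbind", lapply(1:6, f))
dim(bound)      # 6 rows, 4 columns
```

The repair step relies on every cell holding exactly one value of the same type; if the cells had different types, unlist() would coerce them to a common type.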