Yohan Chalabi
2008-Dec-03 19:28 UTC
[Rd] reduce limit number of arguments in methods:::cbind
Dear all,
As far as I understand, the number of arguments in methods:::cbind is
limited by the "self recursive" construction of the function,
which generates deeply nested calls.
A workaround could be to use the internal cbind function on blocks of
non S4 objects. The limitation would then be reduced to the number of
consecutive S4 objects.
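To make the block idea concrete, here is a rough standalone sketch in plain R, separate from the patch further down. The helper name collapse_non_s4 is hypothetical (it is not part of the methods package), and the sketch only pre-processes an argument list; it does not replace methods:::cbind itself.

```r
# Hypothetical helper (an assumption for illustration, not part of the
# methods package): bind every run of consecutive non-S4 arguments with the
# base cbind and leave S4 objects untouched, so that only the S4 objects
# (plus one combined block per non-S4 run) remain for the S4-aware code.
collapse_non_s4 <- function(args) {
  if (length(args) == 0L) return(args)
  is_s4  <- vapply(args, isS4, logical(1), USE.NAMES = FALSE)
  runs   <- rle(is_s4)                  # runs of S4 / non-S4 arguments
  ends   <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L
  out <- list()
  for (i in seq_along(starts)) {
    block <- args[starts[i]:ends[i]]
    if (runs$values[i]) {
      out <- c(out, block)              # S4 objects are kept one by one
    } else {
      # a whole run of non-S4 objects becomes a single argument
      out <- c(out, list(do.call(base::cbind, block)))
    }
  }
  out
}

mlist <- rep(list(matrix(0, 2, 1)), 400)
collapsed <- collapse_non_s4(c(mlist, mlist))
length(collapsed)       # number of arguments left after collapsing
dim(collapsed[[1]])
```

With such a pre-processing step, the number of arguments reaching the recursive code depends on the number of S4 objects and of non-S4 runs, not on the total number of arguments.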
##### R code #####
dfr <- data.frame(matrix(0, nrow = 1 , ncol = 1000))
dfr2 <- is.na(dfr)
mlist <- rep(list(matrix(0, 2, 1)), 400)
cb1 <- do.call("cbind", c(mlist, mlist))
methods:::bind_activation(TRUE)
dfr2 <- is.na(dfr) # fails
cb2 <- do.call("cbind", mlist) # ok
cb3 <- do.call("cbind", c(mlist, mlist)) # fails
# This could be avoided by first checking that the arguments have no S4
# objects. If this is the case, the function falls back to the
# internal cbind function.
# But this would not be very helpful if the arguments are a mixture of
# S4 and non S4 objects
library(Matrix)
Mlist <- rep(list(Matrix(0, 2, 1)), 400)
cb4 <- do.call("cbind", Mlist) # ok
cb5 <- do.call("cbind", c(Mlist, Mlist)) # fails
cb6 <- do.call("cbind", c(Mlist, mlist)) # fails
# A workaround could be to use the internal cbind function on blocks of
# non S4 objects. The limitation would be reduced to the number of
# consecutive S4 objects
# After modifications
dfr2 <- is.na(dfr) # ok
cb7 <- do.call("cbind", mlist) # ok
cb8 <- do.call("cbind", c(mlist, mlist)) # ok
cb9 <- do.call("cbind", c(Mlist, mlist)) # ok
cb10 <- do.call("cbind", c(Mlist, Mlist)) # fails as expected
##### END #####
The code below gives an idea of how to do it, but it was not fully tested!
Hope it helps,
Yohan
Index: methods/R/cbind.R
===================================================================
--- methods/R/cbind.R (revision 47045)
+++ methods/R/cbind.R (working copy)
@@ -39,11 +39,10 @@
## remove trailing 'NULL's:
while(na > 0 && is.null(argl[[na]])) { argl <- argl[-na]; na <- na - 1 }
if(na == 0) return(NULL)
- if(na == 1) {
- if(isS4(..1))
- return(cbind2(..1))
- else return(.Internal(cbind(deparse.level, ...)))
- }
+ if (!any(aS4 <- unlist(lapply(argl, isS4))))
+ return(.Internal(cbind(deparse.level, ...)))
+ if(na == 1)
+ return(cbind2(..1))
## else : na >= 2
@@ -64,6 +63,15 @@
else { ## na >= 3 arguments: -- RECURSION -- with care
## determine nrow(<result>) for e.g., cbind(diag(2), 1, 2)
## only when the last two argument have *no* dim attribute:
+ idx.aS4 <- 0
+ while (!rev(aS4)[idx.aS4+1])
+ idx.aS4 <- idx.aS4 + 1
+ if (idx.aS4 > 1) {
+ argl0 <- argl[(na-idx.aS4+1):na]
+ argl1 <- do.call(cbind, c(argl0, list(deparse.level=deparse.level)))
+ argl2 <- c(argl[1L:(na-idx.aS4)], list(argl1))
+ return(do.call(cbind, c(argl2, list(deparse.level=deparse.level))))
+ }
nrs <- unname(lapply(argl, nrow)) # of length na
iV <- sapply(nrs, is.null)# is 'vector'
fix.na <- identical(nrs[(na-1):na], list(NULL,NULL))
My 2c:

The real issue for me is that this approach to handling S4 objects by
altering R functions for the worse is incorrect. (by calling
bind_activation)

m <- matrix(1:2e6L) # 2 million obs
> system.time(cbind(m,m))
   user  system elapsed
  0.027   0.017   0.044
> methods:::bind_activation(TRUE)
[1] FALSE

# the additional overhead of cbind is now damaging to cbind S3 methods
> system.time(cbind(m,m))
   user  system elapsed
  0.043   0.034   0.077 [~175% of the original time]

Wouldn't a better near-term approach involve writing S3 methods to
dispatch on.

> methods:::bind_activation(FALSE)
> library(Matrix)
> M <- Matrix(1:10)
> cbind(M,M)
     M M
[1,] ? ?

> cbind.dgeMatrix <- function(..., deparse.level=1) methods:::cbind(..., deparse.level=deparse.level)
> cbind(M,M)
10 x 2 Matrix of class "dgeMatrix"
      [,1] [,2]
 [1,]    1    1
 [2,]    2    2
 [3,]    3    3
 [4,]    4    4
 [5,]    5    5
 [6,]    6    6
 [7,]    7    7
 [8,]    8    8
 [9,]    9    9
[10,]   10   10

# this approach "does no harm" to regular S3 methods
> system.time(cbind(m,m))
   user  system elapsed
  0.028   0.017   0.045

Obviously this negates part of the S4 dispatch value, but that can be
had by calling cbind2 directly.

Jeff

--
Jeffrey Ryan
jeffrey.ryan at insightalgo.com
ia: insight algorithmics
www.insightalgo.com
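
For illustration only, the S3-method idea from the reply above can be combined with direct calls to cbind2 roughly as follows. This is a sketch under assumptions: ToyMat is a hypothetical toy S4 class standing in for a Matrix class, and the fold with Reduce is swapped in here in place of the recursive methods:::cbind, so that call depth does not grow with the number of arguments. It does not use bind_activation.

```r
library(methods)

# Hypothetical toy S4 class wrapping a plain matrix (an assumption for this
# sketch; it stands in for a class such as those in the Matrix package).
setClass("ToyMat", slots = c(m = "matrix"))

# cbind2 method: bind the wrapped matrices pairwise with the base cbind.
setMethod("cbind2", signature("ToyMat", "ToyMat"),
          function(x, y, ...) new("ToyMat", m = base::cbind(x@m, y@m)))

# S3 method for cbind on the toy class: fold cbind2 over the arguments
# iteratively instead of recursing over them. deparse.level is accepted
# for compatibility with the cbind signature but ignored here.
cbind.ToyMat <- function(..., deparse.level = 1) {
  Reduce(cbind2, list(...))
}

tlist <- rep(list(new("ToyMat", m = matrix(0, 2, 1))), 800)
res <- do.call(cbind, tlist)
dim(res@m)
```

The base cbind is left untouched for ordinary matrices, which is the "does no harm" property argued for above; the cost is that each S4 class needs its own S3 wrapper.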