Hi,

It seems that on RWin 1.7.0 and 1.6.2 isSeekable() returns FALSE on binary files, while seek() works as expected on the same connection. See the example below:

> con = file(nm, "rb")
> isSeekable(con)
[1] FALSE
> readBin(con, double(), 10)
 [1] 7.263824e-317 5.968155e-317 2.340685e-317 2.734062e-312 4.088386e-312 4.670335e-317
 [7] 6.097545e-317 3.396341e-312 6.615484e-317 1.365171e-312
> readBin(con, double(), 10)
 [1] 1.303796e-317 5.577835e-317 3.409314e-312 1.303543e-317 3.893617e-317 4.077940e-312
 [7] 6.910006e-313 2.694357e-318 4.088373e-312 6.484955e-317
> seek(con, 0, origin="start")
[1] 160
> readBin(con, double(), 10)
 [1] 7.263824e-317 5.968155e-317 2.340685e-317 2.734062e-312 4.088386e-312 4.670335e-317
 [7] 6.097545e-317 3.396341e-312 6.615484e-317 1.365171e-312
>

Am I doing something silly, or is the function returning the wrong value?

Regards,
Laurens Leerink
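The report above can be reproduced without the original file. The following is a minimal sketch, assuming a temporary file of 30 doubles stands in for `nm` (the file and its contents are invented for illustration). It prints whatever isSeekable() reports on the running platform and then checks independently that seek() really rewinds the connection.

```r
# Stand-in for the poster's file `nm`: a temporary file holding 30 doubles
# (hypothetical data, not from the original post).
tmp <- tempfile()
out <- file(tmp, "wb")
writeBin(as.double(1:30), out)
close(out)

con <- file(tmp, "rb")
print(isSeekable(con))                  # platform/version dependent

first <- readBin(con, double(), 10)     # reads bytes 0..79
pos   <- seek(con, 0, origin = "start") # returns the position *before* the move
again <- readBin(con, double(), 10)     # should repeat the first read

close(con)
unlink(tmp)

print(pos)
print(identical(first, again))
```

Note that seek() returns the position the connection had before repositioning, which is why the transcript above shows 160 (two reads of ten 8-byte doubles) rather than 0.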
laurent buffat
2003-May-23 14:00 UTC
[R] replaceMethod time and memory for very large object.
Hi there,

First, please excuse me: I'm not fluent in English.

I am trying to manipulate very large objects in R, and I have problems with memory and access time because of the "by value" mechanism.

I would like to "encapsulate" a large vector in a class and access the vector through a method and a replace method, but there is a lot of "implicit copying", and so a lot of memory and time is consumed.

The data are very large and come from microarray experiments (see http://Biocondutor.org for more detail on what a microarray is), but a typical vector is 20000 genes * 20 probes * 100 experiments * 2 (mean and variance).

The best way, in terms of speed and memory, is to try to emulate a "by reference" mechanism, but it's not very "in the spirit of R" and a little "dangerous" (see the example).

Could you give me some recommendations?

Thanks for your help. The code below is a little "long", sorry.

Laurent B.

////////////////////////////

setClass("Foo", representation(v = "numeric"))

# The default is numeric() so that new("Foo") fits the "numeric" slot.
setMethod("initialize", signature("Foo"),
  function(.Object, v = numeric()) {
    .Object@v <- v
    .Object
  })

setGeneric("v", function(.Object) standardGeneric("v"))
setMethod("v", "Foo", function(.Object) .Object@v)

setGeneric("v<-", function(.Object, value) standardGeneric("v<-"))
setReplaceMethod("v", "Foo",
  function(.Object, value) {
    .Object@v <- value
    return(.Object)
  })

setMethod("[", "Foo", function(x, i, j = NA, ..., drop = FALSE) x@v[i])

setReplaceMethod("[", "Foo",
  function(x, i, j = NA, ..., value) {
    x@v[i] <- value
    x
  })

n <- 2000 * 20 * 100 * 2
# In fact I would like to have
# 20000 genes * 20 measures per gene (probes) * 100 experiments * 2 (mean and variance),
# but that is too much memory for this example, so just try with 2000 "genes".

x <- rep(1, n)            # x: a non-encapsulated vector for the data
y <- new("Foo", v = x)    # y: an encapsulated version
x[1] <- 2
y@v[1] <- 2
v(y)[1] <- 2
y[1] <- 2

nt <- 10   # number of tests

system.time(for(i in 1:nt) x[1] <- 2)
system.time(for(i in 1:nt) y@v[1] <- 2)
system.time(for(i in 1:nt) v(y)[1] <- 2)
system.time(for(i in 1:nt) y[1] <- 2)
# [1] 0 0 0 0 0
# [1] 7.80 3.17 10.97 0.00 0.00
# [1] 10.19 5.39 15.60 0.00 0.00
# [1] 9.00 4.54 13.55 0.00 0.00

x[1:2]
y[1:2]
v(y)[1:2]
y@v[1:2]

system.time(for(i in 1:nt) x[1:2])
system.time(for(i in 1:nt) y[1:2])
system.time(for(i in 1:nt) v(y)[1:2])
system.time(for(i in 1:nt) y@v[1:2])
# [1] 0 0 0 0 0
# [1] 0 0 0 0 0
# [1] 0 0 0 0 0
# [1] 0 0 0 0 0

# No problem for the access methods, only for the replace methods.

# Class FooPtr:
# an attempt to bypass R's "by value" mechanism ...

setClass("FooPtr", representation(p = "environment"))

setMethod("initialize", signature("FooPtr"),
  function(.Object, v = vector()) {
    .Object@p <- new("environment")
    assign("v", v, envir = .Object@p)
    .Object
  })

setMethod("v", "FooPtr", function(.Object) get("v", envir = .Object@p))

setReplaceMethod("v", "FooPtr",
  function(.Object, value) {
    assign("v", value, envir = .Object@p)
    return(.Object)
  })

setMethod("[", "FooPtr",
  function(x, i, j = NA, ..., drop = FALSE) get("v", envir = x@p)[i])

# A first version of "[<-" for FooPtr:
setReplaceMethod("[", "FooPtr",
  function(x, i, j = NA, ..., value) {
    v <- get("v", envir = x@p)
    v[i] <- value
    assign("v", v, envir = x@p)
    x
  })

z <- new("FooPtr", v = x)

x[1] <- 2
v(z)[1] <- 2
z[1] <- 2

system.time(for(i in 1:nt) x[1] <- 2)
system.time(for(i in 1:nt) v(z)[1] <- 2)
system.time(for(i in 1:nt) z[1] <- 2)
# [1] 0.01 0.00 0.01 0.00 0.00
# [1] 0 0 0 0 0
# [1] 1.63 1.18 2.81 0.00 0.00

# The v(z)[1] form is "good", but not "[<-".

# A crazier way to try "by reference": evaluate the assignment
# inside the environment itself.
setReplaceMethod("[", "FooPtr",
  function(x, i, j = NA, ..., value) {
    assign("i", i, envir = x@p)
    assign("value", value, envir = x@p)
    eval(expression(v[i] <- value), envir = x@p)
    rm("i", "value", envir = x@p)
    x
  })

system.time(for(i in 1:nt) x[1] <- 2)
system.time(for(i in 1:nt) v(z)[1] <- 2)
system.time(for(i in 1:nt) z[1] <- 2)
# [1] 0 0 0 0 0
# [1] 0 0 0 0 0
# [1] 0.14 0.12 0.26 0.00 0.00

# "[<-" is better, but v(z)[] is the best ... (why ???)

# OK, v(z)[i] is the "best" access, but you need to know what you are doing:

v(z)[1] <- 12345
z1 <- z
v(z1)[1]

# z and z1 work with the same environment ...

//////////////////////

Thanks for your help.

Laurent
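
The aliasing hazard in the last lines of the post can be isolated from the S4 machinery. What follows is only a sketch, not the poster's code: `new_ref` and `copy_ref` are hypothetical helper names invented here, and plain environments stand in for the `p` slot of `FooPtr`. It shows that ordinary assignment shares the environment, and that an explicit copy is one way to get an independent object.

```r
# Hypothetical helpers (not from the original post): wrap a vector in an
# environment, and make an explicit deep copy of such a wrapper.
new_ref <- function(v) {
  e <- new.env()
  assign("v", v, envir = e)
  e
}
copy_ref <- function(e) new_ref(get("v", envir = e))

z  <- new_ref(rep(1, 10))
z1 <- z             # plain assignment: z1 is the SAME environment as z
z2 <- copy_ref(z)   # explicit copy: z2 has its own environment

# Modify the vector held by z.
v <- get("v", envir = z)
v[1] <- 12345
assign("v", v, envir = z)

print(get("v", envir = z1)[1])  # the alias sees the change
print(get("v", envir = z2)[1])  # the explicit copy does not
```

The trade-off is the one described in the post: sharing avoids duplicating the large vector, at the price of making "copies" such as `z1` silently track later modifications unless a copy is requested explicitly.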