Hi,

Seems that on RWin 1.7.0 and 1.6.2 isSeekable returns F on binary files, while seek() works as expected on the same connection - see example below:

> con = file(nm, "rb")
> isSeekable(con)
[1] FALSE
> readBin(con, double(), 10)
 [1] 7.263824e-317 5.968155e-317 2.340685e-317 2.734062e-312 4.088386e-312 4.670335e-317
 [7] 6.097545e-317 3.396341e-312 6.615484e-317 1.365171e-312
> readBin(con, double(), 10)
 [1] 1.303796e-317 5.577835e-317 3.409314e-312 1.303543e-317 3.893617e-317 4.077940e-312
 [7] 6.910006e-313 2.694357e-318 4.088373e-312 6.484955e-317
> seek(con, 0, origin="start")
[1] 160
> readBin(con, double(), 10)
 [1] 7.263824e-317 5.968155e-317 2.340685e-317 2.734062e-312 4.088386e-312 4.670335e-317
 [7] 6.097545e-317 3.396341e-312 6.615484e-317 1.365171e-312

Am I doing something silly or is the function returning the wrong value?

Regards,
Laurens Leerink
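For anyone wanting to reproduce this, here is a minimal, self-contained sketch. It assumes a temporary file with made-up contents (the original `nm` file is not shown), so the values will differ from the session above. The point is only to compare what isSeekable() reports with what seek() actually does on the same connection.

```r
# Assumption: a throwaway file stands in for the original `nm`.
nm <- tempfile(fileext = ".bin")
writeBin(as.double(1:30), nm)

con <- file(nm, "rb")
seekable <- isSeekable(con)                 # what the connection reports
first <- readBin(con, double(), 10)         # first 10 doubles
second <- readBin(con, double(), 10)        # next 10 doubles
prev_pos <- seek(con, 0, origin = "start")  # returns the position before the move
again <- readBin(con, double(), 10)         # read again from the start
close(con)
unlink(nm)

print(seekable)
print(prev_pos)
print(identical(first, again))
```

seek() returns the byte offset before the move, so the 160 in the session above corresponds to the 20 doubles (8 bytes each) that had already been read.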
laurent buffat
2003-May-23 14:00 UTC
[R] replaceMethod time and memory for very large object.
Hi there,
First, my apologies: I'm not fluent in English.
I am trying to manipulate very large objects in R, and I have problems with
memory use and access time because of the "by value" mechanism.
I would like to "encapsulate" a large vector in a class and access the
vector through a method and a replace method, but there is a lot of "implicit
copying", which costs a lot of memory and time.
The data are very large and come from microarray experiments (see
http://Biocondutor.org for more detail on what a microarray is); a
"typical" vector holds 20000 genes * 20 probes * 100 experiments * 2 (mean
and variance) values.
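To put a number on that, here is a rough sketch of the size arithmetic. It assumes the values are stored as R doubles (8 bytes each) and ignores the small per-object header.

```r
n_full <- 20000 * 20 * 100 * 2   # number of values in the full data set
bytes_full <- n_full * 8         # assumption: 8 bytes per double
mib_full <- bytes_full / 2^20    # size in MiB
print(c(values = n_full, bytes = bytes_full, MiB = round(mib_full)))
```

Under that assumption, each implicit copy of the full vector costs roughly 610 MiB.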
The best way, in terms of speed and memory, is to try to emulate a "by
reference" mechanism, but that is not very much "in the spirit of R" and is a
little "dangerous" (see the example).
Could you give me some recommendations?
Thanks for your help.
The code below is a little "long", sorry.
Laurent B.
////////////////////////////
setClass("Foo", representation(v = "numeric"))
setMethod("initialize", signature("Foo"), function(.Object,
    v=vector()) {
  .Object@v <- v
  .Object
})
setGeneric("v", function(.Object) standardGeneric("v"))
setMethod("v", "Foo", function(.Object) .Object@v )
setGeneric("v<-",function(.Object,value)
  standardGeneric("v<-"))
setReplaceMethod("v", "Foo", function(.Object, value) {
  .Object@v <- value
  return(.Object)
})
setMethod("[","Foo", function(x,i,j=NA,...,drop=FALSE) x@v[i] )
setReplaceMethod("[","Foo",function(x,i,j=NA,...,value) {
  x@v[i] <- value
  x
})
n <- 2000 * 20 * 100 * 2
# In fact I would like to have
# 20000 genes * 20 measures per gene (probes) * 100 experiments * 2 (mean
# and variance),
# but that is too much memory for this example, so just try with 2000
# "genes".
x <- rep(1,n)
# x, a non-encapsulated vector for the data
y <- new("Foo",v=x)
# y, an encapsulated version
x[1] <- 2
y@v[1] <- 2
v(y)[1] <- 2
y[1] <- 2
nt <- 10 # number of tests
system.time(for(i in 1:nt) x[1] <- 2)
system.time(for(i in 1:nt) y@v[1] <- 2)
system.time(for(i in 1:nt) v(y)[1] <- 2)
system.time(for(i in 1:nt) y[1] <- 2)
[1] 0 0 0 0 0
[1] 7.80 3.17 10.97 0.00 0.00
[1] 10.19 5.39 15.60 0.00 0.00
[1] 9.00 4.54 13.55 0.00 0.00
x[1:2]
y[1:2]
v(y)[1:2]
y@v[1:2]
system.time(for(i in 1:nt) x[1:2])
system.time(for(i in 1:nt) y[1:2])
system.time(for(i in 1:nt) v(y)[1:2])
system.time(for(i in 1:nt) y@v[1:2])
[1] 0 0 0 0 0
[1] 0 0 0 0 0
[1] 0 0 0 0 0
[1] 0 0 0 0 0
# no problem for the "access" methods, only for the replace methods
# Class FooPtr,
# a way to try to bypass the "by value" mechanism of R ...
setClass("FooPtr", representation(p = "environment"))
setMethod("initialize", signature("FooPtr"),
  function(.Object, v=vector()) {
    .Object@p <- new("environment")
    assign("v",v,envir=.Object@p)
    .Object
  })
setMethod("v", "FooPtr", function(.Object)
  get("v",envir=.Object@p) )
setReplaceMethod("v", "FooPtr",
  function(.Object, value) {
    assign("v",value,envir=.Object@p)
    return(.Object)
  })
setMethod("[","FooPtr", function(x,i,j=NA,...,drop=FALSE)
  get("v",envir=x@p)[i] )
# a first version of "[<-" for FooPtr :
setReplaceMethod("[","FooPtr",function(x,i,j=NA,...,value)
{
  v <- get("v",envir=x@p)
  v[i] <- value
  assign("v",v,envir=x@p)
  x
})
z <- new("FooPtr",v=x)
x[1] <- 2
v(z)[1] <- 2
z[1] <- 2
system.time(for(i in 1:nt) x[1] <- 2)
system.time(for(i in 1:nt) v(z)[1] <- 2)
system.time(for(i in 1:nt) z[1] <- 2)
[1] 0.01 0.00 0.01 0.00 0.00
[1] 0 0 0 0 0
[1] 1.63 1.18 2.81 0.00 0.00
# the v(z)[1] is "good", but not "[<-"
# a crazier way to try "by reference"
setReplaceMethod("[","FooPtr",function(x,i,j=NA,...,value)
{
  assign("i",i,envir=x@p)
  assign("value",value,envir=x@p)
  eval(expression(v[i] <- value), envir=x@p)
  rm("i","value",envir=x@p)
  x
})
system.time(for(i in 1:nt) x[1] <- 2)
system.time(for(i in 1:nt) v(z)[1] <- 2)
system.time(for(i in 1:nt) z[1] <- 2)
[1] 0 0 0 0 0
[1] 0 0 0 0 0
[1] 0.14 0.12 0.26 0.00 0.00
# "[<-" is better, but v(z)[] is the best ... (why ???)
# OK, v(z)[i] is the "best" access, but you need to know what you are doing:
v(z)[1] <- 12345
z1 <- z
v(z1)[1]
# z and z1 work with the same environment ...
//////////////////////
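The "danger" mentioned above can be sketched without the classes. This is only an illustration using plain environments: the helper names `make_ref` and `copy_ref` are hypothetical and are not part of the code above. It shows that plain assignment aliases the environment, and that one possible workaround is an explicit copy into a fresh environment.

```r
# Hypothetical helpers, for illustration only.
make_ref <- function(v) {
  p <- new.env()
  assign("v", v, envir = p)
  p
}
copy_ref <- function(p) {
  q <- new.env()                               # fresh environment
  assign("v", get("v", envir = p), envir = q)  # copy the data into it
  q
}

a <- make_ref(c(1, 2, 3))
b <- a             # alias: a and b are the same environment
cp <- copy_ref(a)  # independent copy

assign("v", c(12345, 2, 3), envir = a)
print(get("v", envir = b)[1])   # follows the change made through a
print(get("v", envir = cp)[1])  # keeps the old value
```

An explicit copy like this costs the full vector once, but only when a copy is actually wanted rather than on every replacement.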
Thanks for your help.
Laurent