Hello! I am using the TeXmacs interface to R. (Though I encountered a similar problem when using Sweave) In doing calculations I often ecounter this scenario: I'll have some calculations in my file: -- A=read.lots.of.data() B=huge.calculation.on(A) C=another.calculation.on(B) -- Now, if A has already been read, I don't need to re-read it. If B has already been calculated, I don't need to recalculate it. But I would like to be able to just press 'enter' on each of them. So, I would like R to somehow figure out dependencies (a bit like in Makefiles) I implemented something like this with the following functions: ---------------------- touch=function(x) {attr(x,"last.updated")=Sys.time();x} last.updated=function(a) { if( length(attr(a,"last.updated")) == 0 ) { Sys.time() } else { attr(a,"last.updated") } } "depends<-"=function(a,value,...) { args=list(...) if( length(attr(a,"last.updated")) == 0 ) { a <- value a <-touch(a) } else { lu=(sapply(args,function(x) last.updated(x)-last.updated(a) > 0 )) if( sum(lu)>0 ) { a <- value a <-touch(a) } } a } ------------------------ Then I can implement what I wanted above as follows: -- if( !exists(A) ) { A=read.lots.of.data(); A=touch(A) } depends(B,A)=huge.calculation.on(A) # this means the assignment 'B=huge.calculation.on(A)' is # done only if A has been updated more recently than B. depends(C,B)=another.calculation.on(B) # dito for C more recent than B. -- And now I can carelessly press 'enter' on these expression that might otherwise take hours to compute. Each variable has a datestamp of the last time it was updated, and I do each calculation conditional on whether certain variables have been recently changed. I can also save A,B,C to a file,later load them, and the calculations will not be redone. But this solution is quite ugly, because of several problems: 1. To call 'depends(A,B)=f(B)' the first time, A has to already exist, otherwise I get an error (before I enter the "depends<-" function.) 2. I would also like to have a convenient way to do "if( !exists(A) ) { A=read.lots.of.data(); A=touch(A) }" maybe something like: depends(A)<-read.lots.of.data() But that doesn't work, because of 1. or A %set% read.lots.data() But that doesn't work, because I haven't figured out a way for a function to change one of its variables. (Maybe I could do A=A %set read.lots.of.data(), but that is really ugly...) 3. It would be nice to be able to do touch(A) instead of A=touch(A) 4. If I modify A without calling 'A=touch(A)', then B will not be updated next time I call 'depends(B,A)=huge.calculation.on(A)'. So it would be nice to have the variable's 'last updated' time updated automatically. (Though then it is a bit problematic to decide what the 'last updated' time should be for variables loaded from a file...) 5. The whole thing is rather cludgy. But I haven't found a good way to implement it. Suggestions? Thanks, Michael
Michael Lachmann <lachmann <at> eva.mpg.de> writes: : : Hello! : : I am using the TeXmacs interface to R. (Though I encountered a similar : problem when using Sweave) : In doing calculations I often ecounter this scenario: I'll have some : calculations in my file: : -- : A=read.lots.of.data() : : B=huge.calculation.on(A) : : C=another.calculation.on(B) : -- : Now, if A has already been read, I don't need to re-read it. If B has : already been calculated, I don't need to recalculate it. But I would : like to be able to just press 'enter' on each of them. : : So, I would like R to somehow figure out dependencies (a bit like in : Makefiles) : : I implemented something like this with the following functions: : ---------------------- : touch=function(x) {attr(x,"last.updated")=Sys.time();x} : : last.updated=function(a) { : if( length(attr(a,"last.updated")) == 0 ) { : Sys.time() : } else { : attr(a,"last.updated") : } : } : : "depends<-"=function(a,value,...) { : args=list(...) : if( length(attr(a,"last.updated")) == 0 ) { : a <- value : a <-touch(a) : } else { : lu=(sapply(args,function(x) last.updated(x)-last.updated(a) > 0 )) : if( sum(lu)>0 ) { : a <- value : a <-touch(a) : } : } : a : } : ------------------------ : Then I can implement what I wanted above as follows: : -- : if( !exists(A) ) { A=read.lots.of.data(); A=touch(A) } : : depends(B,A)=huge.calculation.on(A) : # this means the assignment 'B=huge.calculation.on(A)' is : # done only if A has been updated more recently than B. : : depends(C,B)=another.calculation.on(B) : # dito for C more recent than B. : -- : And now I can carelessly press 'enter' on these expression that might : otherwise take hours to compute. Each variable has a datestamp of the : last time it was updated, and I do each calculation conditional on : whether certain variables have been recently changed. I can also save : A,B,C to a file,later load them, and the calculations will not be redone. : : But this solution is quite ugly, because of several problems: : : 1. To call 'depends(A,B)=f(B)' the first time, A has to already exist, : otherwise I get an error (before I enter the "depends<-" function.) The technique used to implement mulitple return values shown in http://tolstoy.newcastle.edu.au/R/help/04/06/1406.html could be adapted to this problem. Using that technique the code would look like this: depends[A,B] <- f(B) and A would not have to pre-exist. You define a structure with a class of depends, say: depends <- structure(NA, class = "depends") and then define the [<-.depends action on that structure in an analogous way to what was done there. : 2. I would also like to have a convenient way to do : "if( !exists(A) ) { A=read.lots.of.data(); A=touch(A) }" : maybe something like: : depends(A)<-read.lots.of.data() : But that doesn't work, because of 1. : or : A %set% read.lots.data() : But that doesn't work, because I haven't figured out a way for a : function to change one of its variables. : (Maybe I could do A=A %set read.lots.of.data(), but that is really ugly...) Is this what you want? R> f <- function(x,v) assign(as.character(substitute(x)), v, parent.frame()) R> x # x does not exist Error: Object "x" not found R> f(x,3) R> x # now it does [1] 3 <<- can be used if the eval.parent is a third way (see #3 below). : 3. It would be nice to be able to do touch(A) instead of A=touch(A) touch <- function(x) eval.parent(substitute(attr(x,"last.updated")<-Sys.time())) x <- 3 touch(x) : : 4. If I modify A without calling 'A=touch(A)', then B will not be : updated next time I call 'depends(B,A)=huge.calculation.on(A)'. So it : would be nice to have the variable's 'last updated' time updated : automatically. (Though then it is a bit problematic to decide what the : 'last updated' time should be for variables loaded from a file...) If its done in a function you could use on.exit to ensure that it gets updated when leaving the function. : : 5. The whole thing is rather cludgy. But I haven't found a good way to : implement it. : : Suggestions? : : Thanks, : : Michael :
Thank you! Now the conditional assignments work almost prefectly. The current code is at the bottom of the message. Now I can do depends[A,B]=B+9 and this will be executed only if B was more recently updated then A, and if the last time A was updated, the expression used was 'B+9'. And it is also possible to do depends[A,file="table.txt"]=read.table("table.txt") and A will only be read again if the file "table.txt" was updated more recently than it was read last time. I have only a few remaining questions: 1. I added attributes to variables, "last.updated", and "update.expression". If the variables are just regular vectors/lists, then when I print them, I also see these attributes. Is it somehow possible to have these attributes hidden? 2. I defined an operator %set%, which is supposed to work just like "<-" and "=", but also updates "last.updated" and "update.expression". It would be best to really replace "<-" and "=" with this operator, and then any update of a variable will be registered. This doesn't seem to work, because if I define "<-" as function(x,value) { v=as.character(as.expression(substitute(value))) eval.parent(substitute(.Primitive("<-")(x,value))) eval.parent(substitute(attr(x,"last.updated")<-Sys.time())) eval.parent(substitute(attr(x,"update.expression")<-v)) x } then this function calls "attr<-", which then calls "<-" leading to infinite recursion. Is there a way to set an attribute on a variable without using attr<- ? Or maybe it is a bad idea to redefine "<-" in the first place... Thank you very much! Michael Gabor Grothendieck wrote:>Michael Lachmann <lachmann <at> eva.mpg.de> writes: > >: >: Hello! >: >: I am using the TeXmacs interface to R. (Though I encountered a similar >: problem when using Sweave) >: In doing calculations I often ecounter this scenario: I'll have some >: calculations in my file: >: -- >: A=read.lots.of.data() >: >: B=huge.calculation.on(A) >: >: C=another.calculation.on(B) >: -- >: Now, if A has already been read, I don't need to re-read it. If B has >: already been calculated, I don't need to recalculate it. But I would >: like to be able to just press 'enter' on each of them. >: >: So, I would like R to somehow figure out dependencies (a bit like in >: Makefiles) >: >: I implemented something like this with the following functions...>: But this solution is quite ugly, because of several problems: >: >: 1. To call 'depends(A,B)=f(B)' the first time, A has to already exist, >: otherwise I get an error (before I enter the "depends<-" function.) > >The technique used to implement mulitple return values shown in > > http://tolstoy.newcastle.edu.au/R/help/04/06/1406.html > >could be adapted to this problem. Using that technique the code would >look like this: > > depends[A,B] <- f(B) > >and A would not have to pre-exist. > >You define a structure with a class of depends, say: > > depends <- structure(NA, class = "depends") > >and then define the [<-.depends action on that structure >in an analogous way to what was done there. > >: 2. I would also like to have a convenient way to do >: "if( !exists(A) ) { A=read.lots.of.data(); A=touch(A) }" >: maybe something like: >: depends(A)<-read.lots.of.data() >: But that doesn't work, because of 1. >: or >: A %set% read.lots.data() >: But that doesn't work, because I haven't figured out a way for a >: function to change one of its variables. >: (Maybe I could do A=A %set read.lots.of.data(), but that is really ugly...) > > >Is this what you want? > >R> f <- function(x,v) assign(as.character(substitute(x)), v, parent.frame()) >R> x # x does not exist >Error: Object "x" not found >R> f(x,3) >R> x # now it does >[1] 3 > ><<- can be used if the eval.parent is a third way (see #3 below). > >: 3. It would be nice to be able to do touch(A) instead of A=touch(A) > >touch <- function(x) > eval.parent(substitute(attr(x,"last.updated")<-Sys.time())) >x <- 3 >touch(x) > >: >: 4. If I modify A without calling 'A=touch(A)', then B will not be >: updated next time I call 'depends(B,A)=huge.calculation.on(A)'. So it >: would be nice to have the variable's 'last updated' time updated >: automatically. (Though then it is a bit problematic to decide what the >: 'last updated' time should be for variables loaded from a file...) > >If its done in a function you could use on.exit to ensure that it gets >updated when leaving the function. >touch=function(...,l=list()) { args <- as.list(match.call()) if( "l" %in% names(args) ) { argsl=as.list(args[[length(args)]][-1]) argsl=sapply(argsl,function(x) parse(text=x)[[1]]) args=args[-length(args)] } else { argsl=c() } args=c(args[-1],argsl) for( i in 1:length(args)) { eval.parent(substitute(attr(x,"last.updated")<-Sys.time(), list(x=args[[i]]) )) } } last.updated=function(a) { if( length(attr(a,"last.updated")) == 0 ) { Sys.time() } else { attr(a,"last.updated") } } "%set%"=function(x,value) { v=as.character(as.expression(substitute(value))) eval.parent(substitute(.Primitive("<-")(x,value))) eval.parent(substitute(attr(x,"last.updated")<-Sys.time())) eval.parent(substitute(attr(x,"update.expression")<-v)) x } depends <- structure(NA,class="depends") "[<-.depends" <- function(x,...,file=c(),value) { v=as.character(as.expression(substitute(value))) args <- as.list(match.call()) if( !exists(as.character(args[[3]] ), env=sys.frame(-1)) ) { eval(substitute(x<-v,list(x=args[[3]],v=value)),env=sys.frame(-1)) eval(substitute(touch(x),list(x=args[[3]])),env=sys.frame(-1)) eval.parent(substitute(attr(x,"update.expression")<-v,list(x=args[[3]],v=v))) } else { lu=list() xlu=eval(substitute(attr(x,"last.updated"),list(x=args[[3]]),env=sys.frame(-1))) if( length(file) >0 ) { lu=c(lapply(file,function(f) { file.info(f)$mtime }),lu) args=args[-(length(args)-1)] } if( length(xlu)>0 ) { lu=c(lapply((1:(length(args)-1))[-c(1:3)],function(i) { eval(substitute(last.updated(x),list(x=args[[i]])),env=sys.frame(-1)) }),lu) } lu=prod(sapply(lu,function(x) x-xlu<0)) xv=eval(substitute(attr(x,"update.expression"),list(x=args[[3]]),env=sys.frame(-1))) if( length(xv)*length(v) > 0 ) { if( xv != v ) lu = 0 } else { lu=0} if( (lu==0) ) { eval(substitute(x<-v,list(x=args[[3]],v=value)),env=sys.frame(-1)) eval(substitute(touch(x),list(x=args[[3]])),env=sys.frame(-1)) eval.parent(substitute(attr(x,"update.expression")<-v,list(x=args[[3]],v=v))) } } x }
Answers interspersed. Michael Lachmann <lachmann <at> eva.mpg.de> writes: : : Thank you! : Now the conditional assignments work almost prefectly. : The current code is at the bottom of the message. : Now I can do : depends[A,B]=B+9 : and this will be executed only if B was more recently updated then A, and if the last time A was updated, the : expression used was 'B+9'. : : And it is also possible to do : depends[A,file="table.txt"]=read.table("table.txt") : and A will only be read again if the file "table.txt" was updated more recently than it was read last time. : : I have only a few remaining questions: : 1. I added attributes to variables, "last.updated", and "update.expression". If the variables are just : regular vectors/lists, then when I print them, I also see these attributes. Is it somehow possible to have : these attributes hidden? If you assign them to the comment attribute they will be hidden upon printing. comment(x) <- "...whatever..." They must assigned as a character string or a vector of character strings. See ?comment. : : 2. I defined an operator %set%, which is supposed to work just like "<-" and "=", but also updates : "last.updated" and "update.expression". It would be best to really replace "<-" and "=" with this : operator, and then any update of a variable will be registered. This doesn't seem to work, because if I : define "<-" as : function(x,value) : { : v=as.character(as.expression(substitute(value))) : eval.parent(substitute(.Primitive("<-")(x,value))) : eval.parent(substitute(attr(x,"last.updated")<-Sys.time())) : eval.parent(substitute(attr(x,"update.expression")<-v)) : x : } : then this function calls "attr<-", which then calls "<-" leading to infinite recursion. Is there a way to : set an attribute on a variable without using attr<- ? : : Or maybe it is a bad idea to redefine "<-" in the first place... Its probably not a good idea to redefine <-. You could define your own class and then assignment within that class could be defined but, of course, that would only work with variables of your class. You could also define set as a replacement function so that if you did this: set(x) <- ... x would have its time updated assuming you defined your own "set<-" : : Thank you very much! : Michael : : Gabor Grothendieck wrote: : : >Michael Lachmann <lachmann <at> eva.mpg.de> writes: : > : >: : >: Hello! : >: : >: I am using the TeXmacs interface to R. (Though I encountered a similar : >: problem when using Sweave) : >: In doing calculations I often ecounter this scenario: I'll have some : >: calculations in my file: : >: -- : >: A=read.lots.of.data() : >: : >: B=huge.calculation.on(A) : >: : >: C=another.calculation.on(B) : >: -- : >: Now, if A has already been read, I don't need to re-read it. If B has : >: already been calculated, I don't need to recalculate it. But I would : >: like to be able to just press 'enter' on each of them. : >: : >: So, I would like R to somehow figure out dependencies (a bit like in : >: Makefiles) : >: : >: I implemented something like this with the following functions : : ... : : >: But this solution is quite ugly, because of several problems: : >: : >: 1. To call 'depends(A,B)=f(B)' the first time, A has to already exist, : >: otherwise I get an error (before I enter the "depends<-" function.) : > : >The technique used to implement mulitple return values shown in : > : > http://tolstoy.newcastle.edu.au/R/help/04/06/1406.html : > : >could be adapted to this problem. Using that technique the code would : >look like this: : > : > depends[A,B] <- f(B) : > : >and A would not have to pre-exist. : > : >You define a structure with a class of depends, say: : > : > depends <- structure(NA, class = "depends") : > : >and then define the [<-.depends action on that structure : >in an analogous way to what was done there. : > : >: 2. I would also like to have a convenient way to do : >: "if( !exists(A) ) { A=read.lots.of.data(); A=touch(A) }" : >: maybe something like: : >: depends(A)<-read.lots.of.data() : >: But that doesn't work, because of 1. : >: or : >: A %set% read.lots.data() : >: But that doesn't work, because I haven't figured out a way for a : >: function to change one of its variables. : >: (Maybe I could do A=A %set read.lots.of.data(), but that is really ugly...) : > : > : >Is this what you want? : > : >R> f <- function(x,v) assign(as.character(substitute(x)), v, parent.frame()) : >R> x # x does not exist : >Error: Object "x" not found : >R> f(x,3) : >R> x # now it does : >[1] 3 : > : ><<- can be used if the eval.parent is a third way (see #3 below). : > : >: 3. It would be nice to be able to do touch(A) instead of A=touch(A) : > : >touch <- function(x) : > eval.parent(substitute(attr(x,"last.updated")<-Sys.time())) : >x <- 3 : >touch(x) : > : >: : >: 4. If I modify A without calling 'A=touch(A)', then B will not be : >: updated next time I call 'depends(B,A)=huge.calculation.on(A)'. So it : >: would be nice to have the variable's 'last updated' time updated : >: automatically. (Though then it is a bit problematic to decide what the : >: 'last updated' time should be for variables loaded from a file...) : > : >If its done in a function you could use on.exit to ensure that it gets : >updated when leaving the function. : > : : touch=function(...,l=list()) : { : args <- as.list(match.call()) : if( "l" %in% names(args) ) { : argsl=as.list(args[[length(args)]][-1]) : argsl=sapply(argsl,function(x) parse(text=x)[[1]]) : args=args[-length(args)] : } else { argsl=c() } : args=c(args[-1],argsl) : for( i in 1:length(args)) { : eval.parent(substitute(attr(x,"last.updated")<-Sys.time(), : list(x=args[[i]]) )) : } : } : : last.updated=function(a) { : if( length(attr(a,"last.updated")) == 0 ) { : Sys.time() : } else { : attr(a,"last.updated") : } : } : : "%set%"=function(x,value) : { : v=as.character(as.expression(substitute(value))) : eval.parent(substitute(.Primitive("<-")(x,value))) : eval.parent(substitute(attr(x,"last.updated")<-Sys.time())) : eval.parent(substitute(attr(x,"update.expression")<-v)) : x : } : : depends <- structure(NA,class="depends") : : "[<-.depends" <- function(x,...,file=c(),value) { : v=as.character(as.expression(substitute(value))) : args <- as.list(match.call()) : if( !exists(as.character(args[[3]] ), env=sys.frame(-1)) ) { : eval(substitute(x<-v,list(x=args[[3]],v=value)),env=sys.frame(-1)) : eval(substitute(touch(x),list(x=args[[3]])),env=sys.frame(-1)) : eval.parent(substitute(attr(x,"update.expression")<-v,list(x=args[[3]],v=v))) : } else { : lu=list() : xlu=eval(substitute(attr(x,"last.updated"),list(x=args[[3]]),env=sys.frame(-1))) : if( length(file) >0 ) { : lu=c(lapply(file,function(f) { : file.info(f)$mtime : }),lu) : args=args[-(length(args)-1)] : } : if( length(xlu)>0 ) { : lu=c(lapply((1:(length(args)-1))[-c(1:3)],function(i) { : eval(substitute(last.updated(x),list(x=args[[i]])),env=sys.frame(-1)) : }),lu) : } : lu=prod(sapply(lu,function(x) x-xlu<0)) : xv=eval(substitute(attr(x,"update.expression"),list(x=args[[3]]),env=sys.frame(-1))) : if( length(xv)*length(v) > 0 ) { : if( xv != v ) lu = 0 : } else { lu=0} : if( (lu==0) ) { : eval(substitute(x<-v,list(x=args[[3]],v=value)),env=sys.frame(-1)) : eval(substitute(touch(x),list(x=args[[3]])),env=sys.frame(-1)) : eval.parent(substitute(attr(x,"update.expression")<-v,list(x=args[[3]],v=v))) : } : } : x : } :