Lutz Prechelt
2007-Feb-07 18:39 UTC
[Rd] manage R function and data dependencies like 'make'
Dear R-devels,

I am looking for a package (or some other infrastructure) for the following situation:

I am doing a larger data evaluation. I have several dozen data files and do not want to keep the persistent data in the R workspace (for robustness reasons). I have several dozen R files, some for reading and preprocessing data files, others for doing plots or analyses. I will make frequent changes to both data files and R files while doing the analysis.

I would like to automate mechanisms that allow
- a data file reading function to suppress its actual work if neither the data file nor the R file containing the function was modified since the data file was last read
- an R file sourcing function to suppress its actual work if the R file has not been modified
- and perhaps even: automatic re-reading of a data file upon access to the corresponding dataframe iff the file has been modified since the dataframe was created.

In short: something like Unix's 'make', but for managing dependencies of functions and dataframes in addition to files. In R. (And of course I am very open to solutions that are more elegant than what I have sketched above.)

I could not find anything in the help and have rather few ideas for good search terms.

Is any such thing available? (If no such infrastructure exists, what is the right R function for accessing file modification dates?)

Thanks!

Lutz
Prof Brian Ripley
2007-Feb-07 18:48 UTC
[Rd] manage R function and data dependencies like 'make'
R-devel has file_test() in utils (earlier versions had a private version in tools). That has a '-nt' op to do what you need. file.info() accesses modification dates.

Having said that, I would use 'make', as for example R's own test suites do.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
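
A minimal sketch of the timestamp checks mentioned in this reply, using only file.info() and utils::file_test(). The helper names read_if_newer() and is_stale() and the cache environment .cache are hypothetical, introduced here for illustration; they are not part of R. The sketch covers only the data file's own timestamp, not the R file containing the reader.

```r
# Hypothetical in-memory cache: maps a file path to list(mtime, data).
.cache <- new.env()

# Re-read `path` only if its modification time is newer than at the last read.
# `reader` is any function taking a path (read.csv is only an example default).
read_if_newer <- function(path, reader = read.csv) {
  stopifnot(file.exists(path))
  key <- normalizePath(path)
  mtime <- file.info(path)$mtime
  hit <- if (exists(key, envir = .cache, inherits = FALSE)) {
    get(key, envir = .cache)
  } else {
    NULL
  }
  if (is.null(hit) || mtime > hit$mtime) {
    hit <- list(mtime = mtime, data = reader(path))
    assign(key, hit, envir = .cache)
  }
  hit$data
}

# file_test("-nt", x, y) asks "is file x newer than file y?".
# Here: a target is stale if it is missing or older than its source.
is_stale <- function(target, source) {
  !file.exists(target) || utils::file_test("-nt", source, target)
}
```

Calling read_if_newer("some.csv") twice in a row would invoke the reader only once, unless the file is modified in between. The same mtime comparison could guard a wrapper around source().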
Paul Gilbert
2007-Feb-07 19:41 UTC
[Rd] manage R function and data dependencies like 'make'
I use make to do things like you describe. Here is an example target from one of my Makerules files:

    $(T:%=compare.R$(REP).T%):compare.%: compare.R estimates.%
        @echo making $(notdir $(PWD)) $@ because $? changed ...
        @(cd $(subst compare.,,$@) ; \
          echo "z <- try( source('../compare.R')); \
            if (!inherits(z, 'try-error')) q('yes', status=0) else \
            {print(z); q('yes', status=1)} " | \
          R --slave >../$(FLAGS)/$@.log 2>&1)
        @mv $(FLAGS)/$@.log $(FLAGS)/$@

I realize this is out of context and possibly mangled by mail wrap, but if you are familiar with (GNU) make then it should give you a good idea what to do. In this example I am accumulating (intermediate) things in the .RData file and using flags as targets to indicate the status of different steps. Another possibility would be to rename the .RData files and use them as the targets. Let me know if you want a more complete example.

I really would encourage you to think of wrapping R (like a compiler) in make, rather than trying to re-implement something like make within R. (I would be interested to see examples if anyone is using Ant to do this kind of thing.)

Paul
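
The inline R in the make rule above amounts to "source a script and report success or failure through the exit status, so that make can stop on errors". A minimal sketch of that idea as a standalone function follows. The name run_step() is hypothetical and is not part of R or of the Makerules file quoted above.

```r
# Source `script` and return an exit status: 0 on success, 1 on error.
# `run_step` is a hypothetical name used only for this illustration.
run_step <- function(script) {
  # source() evaluates in the global environment by default, so objects
  # created by the script remain in the workspace afterwards.
  z <- try(source(script), silent = TRUE)
  if (inherits(z, "try-error")) {
    print(z)  # show the error so it ends up in the log
    return(1L)
  }
  0L
}

# A script driven by make might then end with something like:
#   quit(save = "yes", status = run_step("compare.R"))
```

Returning the status instead of calling quit() inside the function keeps it usable in an interactive session; save = "yes" mirrors the original rule, which accumulates intermediate results in .RData.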