Wise and merciful R-helpers:

I want to equip a data frame with an attribute
which specifies how to plot some of the columns.

Up to now we have been doing this by giving the data frame
a `formula' attribute, that can be passed to plot.formula.

For example
     dat <- data.frame(x=1:100,y=runif(100),z=100:1)
     attr(dat, "plotme") <- (z ~ x)
     ......
     ......
     if(missing(desiredformula))
        desiredformula <- attr(dat, "plotme")
     plot(desiredformula, data=dat)

We just got bitten by the fact that a formula object has a `.Environment'
attribute, which may be huge, depending on the environment
in which the formula was created. In the example above there is
no upper limit on the size of the object 'dat' !!!!
That is, environment(attr(dat, "plotme")) could be huge.

What is the recommended/safe way to avoid this?
I don't need the formula to have an environment at all; I'm just using
the formula structure to represent the format of the plot.

It appears that we can't set the environment to NULL;
should we set it to the Global environment e.g. using as.formula?

Or is it wiser to save the formula as a character string
in the object 'dat' and only convert it back to a formula
at the last possible moment?

thanks
Adrian Baddeley
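The character-string option raised at the end of the question could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name stored_formula is hypothetical (it is not part of the original code), and passing env = globalenv() to as.formula is one possible choice of environment, not a recommendation taken from the thread.

```r
dat <- data.frame(x = 1:100, y = runif(100), z = 100:1)

# Store the plot specification as plain text.
# A character string carries no environment with it.
attr(dat, "plotme") <- "z ~ x"

# Hypothetical helper: rebuild the formula only when it is needed.
# env = globalenv() is an assumption; any small environment would do.
stored_formula <- function(dat) {
  as.formula(attr(dat, "plotme"), env = globalenv())
}

fo <- stored_formula(dat)
```

With this in place, plot(stored_formula(dat), data = dat) would take the place of the original plot(desiredformula, data = dat) call whenever desiredformula is missing.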
Adrian Baddeley <adrian <at> maths.uwa.edu.au> writes:

: I want to equip a data frame with an attribute
: which specifies how to plot some of the columns.
:
: Up to now we have been doing this by giving the data frame
: a `formula' attribute, that can be passed to plot.formula.
:
: For example
:      dat <- data.frame(x=1:100,y=runif(100),z=100:1)
:      attr(dat, "plotme") <- (z ~ x)
:      ......
:      ......
:      if(missing(desiredformula))
:         desiredformula <- attr(dat, "plotme")
:      plot(desiredformula, data=dat)
:
: We just got bitten by the fact that a formula object has a `.Environment'
: attribute, which may be huge, depending on the environment
: in which the formula was created. In the example above there is
: no upper limit on the size of the object 'dat' !!!!
: That is, environment(attr(dat, "plotme")) could be huge.

Do you mean that if fo is the formula then ls(environment(fo))
has many large components? I don't understand why that would be
a problem.

:
: What is the recommended/safe way to avoid this?
: I don't need the formula to have an environment at all; I'm just using
: the formula structure to represent the format of the plot.
:
: It appears that we can't set the environment to NULL;
: should we set it to the Global environment e.g. using as.formula?

It works for me (R 2.1.0 Windows):

R> fo <- y~x
R> environment(fo)
<environment: R_GlobalEnv>
R> environment(fo) <- NULL
R> environment(fo)
NULL

:
: Or is it wiser to save the formula as a character string
: in the object 'dat' and only convert it back to a formula
: at the last possible moment?
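To make the question about ls(environment(fo)) concrete, here is a minimal sketch of a formula created inside a function, where the captured environment is the function's frame and not the global environment. The names make_formula and big are hypothetical, chosen only for illustration. The sketch then points the formula at the global environment, which is one of the options mentioned in the original question.

```r
# Hypothetical function: the formula is created inside its frame,
# so the formula's environment holds every local variable, including 'big'.
make_formula <- function() {
  big <- numeric(1e6)  # stands in for large intermediate data
  z ~ x
}

fo <- make_formula()
captured <- ls(environment(fo))  # names visible in the captured frame

# Re-point the formula at the global environment, dropping the
# reference to the function frame.
environment(fo) <- globalenv()
```

After the reassignment the formula itself is unchanged; only the environment it refers to differs, so code that uses the formula purely as a description of the plot should be unaffected.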
> From: Martin Maechler <maechler at stat.math.ethz.ch>

> maybe Adrian save()s that data.frame subsequently?
> Then, I assume the environment will be copied.
> In all(?) other circumstances that should only be a pointer and
> not really use much memory.

Yes, sorry for the garbled message, it is only when the object
containing the formula is save()d that we have a problem with huge file size.

A
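The save() effect described above can be reproduced with a small experiment. This is a sketch under stated assumptions: the function build and the variable big are hypothetical stand-ins for whatever code created the formula, and the file sizes are compared without quoting exact numbers, since those depend on the R version and compression settings.

```r
# Hypothetical constructor: 'big' is never stored in 'dat', but it lives
# in the frame that the formula captures as its environment.
build <- function() {
  big <- runif(1e6)
  dat <- data.frame(x = 1:100, y = runif(100), z = 100:1)
  attr(dat, "plotme") <- (z ~ x)
  dat
}
dat <- build()

# Saving the data frame also writes out the captured frame.
f1 <- tempfile(fileext = ".RData")
save(dat, file = f1)
size_with_env <- file.size(f1)

# Reset the formula's environment, then save again.
environment(attr(dat, "plotme")) <- globalenv()
f2 <- tempfile(fileext = ".RData")
save(dat, file = f2)
size_reset <- file.size(f2)

c(with_env = size_with_env, reset = size_reset)
```

The first file should come out far larger than the second, because the global environment is written as a reference while an ordinary function frame is written out together with its contents.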