Wise and merciful R-helpers:
I want to equip a data frame with an attribute
which specifies how to plot some of the columns.
Up to now we have been doing this by giving the data frame
a `formula' attribute, that can be passed to plot.formula.
For example
    dat <- data.frame(x=1:100, y=runif(100), z=100:1)
    attr(dat, "plotme") <- (z ~ x)
    ......
    ......
    if(missing(desiredformula))
        desiredformula <- attr(dat, "plotme")
    plot(desiredformula, data=dat)
We just got bitten by the fact that a formula object has a `.Environment'
attribute, which may be huge, depending on the environment
in which the formula was created. In the example above there is
no upper limit on the size of the object 'dat' !!!!
That is, environment(attr(dat, "plotme")) could be huge.
What is the recommended/safe way to avoid this?
I don't need the formula to have an environment at all; I'm just using
the formula structure to represent the format of the plot.
It appears that we can't set the environment to NULL;
should we set it to the Global environment e.g. using as.formula?
Or is it wiser to save the formula as a character string
in the object 'dat' and only convert it back to a formula
at the last possible moment?
thanks
Adrian Baddeley
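
A minimal sketch of the two alternatives raised in the question, not a recommendation from the thread. Option (a) keeps a formula but points its environment at the global environment; option (b) stores the formula as text and rebuilds it just before plotting. The attribute name "plotme_chr" is hypothetical and used here only for illustration.

```r
dat <- data.frame(x = 1:100, y = runif(100), z = 100:1)

# Option (a): keep a formula, but re-home it in the global environment,
# which save()/serialize() record by reference rather than by content.
f <- z ~ x
environment(f) <- globalenv()
attr(dat, "plotme") <- f

# Option (b): store the formula as a character string
# ("plotme_chr" is a hypothetical attribute name) ...
attr(dat, "plotme_chr") <- "z ~ x"

# ... and convert it back to a formula at the last possible moment.
fmla <- as.formula(attr(dat, "plotme_chr"), env = globalenv())

plot(fmla, data = dat)
```

With either option, any variable in the formula that is not a column of the data frame would be looked up in the global environment, so this only suits formulas that refer to columns of 'dat'.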
Adrian Baddeley <adrian <at> maths.uwa.edu.au> writes:

: I want to equip a data frame with an attribute
: which specifies how to plot some of the columns.
:
: Up to now we have been doing this by giving the data frame
: a `formula' attribute, that can be passed to plot.formula.
:
: For example
: dat <- data.frame(x=1:100,y=runif(100),z=100:1)
: attr(dat, "plotme") <- (z ~ x)
: ......
: ......
: if(missing(desiredformula))
: desiredformula <- attr(dat, "plotme")
: plot(desiredformula, data=dat)
:
: We just got bitten by the fact that a formula object has a `.Environment'
: attribute, which may be huge, depending on the environment
: in which the formula was created. In the example above there is
: no upper limit on the size of the object 'dat' !!!!
: That is, environment(attr(dat, "plotme")) could be huge.

Do you mean that if fo is the formula then ls(environment(fo)) has many
large components? I don't understand why that would be a problem.

:
: What is the recommended/safe way to avoid this?
: I don't need the formula to have an environment at all; I'm just using
: the formula structure to represent the format of the plot.
:
: It appears that we can't set the environment to NULL;
: should we set it to the Global environment e.g. using as.formula?

It works for me (R 2.1.0 Windows):

R> fo <- y~x
R> environment(fo)
<environment: R_GlobalEnv>
R> environment(fo) <- NULL
R> environment(fo)
NULL

:
: Or is it wiser to save the formula as a character string
: in the object 'dat' and only convert it back to a formula
: at the last possible moment?
> From: Martin Maechler <maechler at stat.math.ethz.ch>
> maybe Adrian save()s that data.frame subsequently?
> Then, I assume the environment will be copied.
> In all(?) other circumstances that should only be a pointer and
> not really use much memory.

Yes, sorry for the garbled message, it is only when the object
containing the formula is save()d that we have a problem with huge file size.

A
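
The save() effect described above can be illustrated with a short sketch. It assumes the formula was created inside a function whose frame also holds a large object; make_formula() is a hypothetical helper, not something from the original code. serialize() is used as a stand-in for save(), since both write out the formula's environment along with the object.

```r
# Hypothetical helper: creates a formula inside a function whose frame
# also holds a large object, so the formula's environment captures it.
make_formula <- function() {
  big <- numeric(1e6)   # roughly 8 MB of doubles living in this frame
  z ~ x
}

dat <- data.frame(x = 1:100, y = runif(100), z = 100:1)
attr(dat, "plotme") <- make_formula()

# Length of the serialized object in bytes, environment included.
size_before <- length(serialize(dat, NULL))

# Re-home the formula in the global environment, which is written
# as a reference rather than by content.
f <- attr(dat, "plotme")
environment(f) <- globalenv()
attr(dat, "plotme") <- f

size_after <- length(serialize(dat, NULL))

print(c(before = size_before, after = size_after))
```

These are raw byte counts; save() compresses by default, so the sizes of files on disk will differ, but the captured environment is written out in the first case and not in the second.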