Hi,

I am working with large data frames with many dependent variables. I
want to write some functions that will allow me to quickly select
variables from the frame and plot them in various colors depending on
factor columns, possibly selecting rows according to factor
conditions. In order to do this in a nice function, I need to
understand how to work with a column name in the body of a
function. To simplify my problem, how do I write a function with a
body like

  scatter.plot <- function (data, x, y) {
    plot(data$x, data$y)
  }

(which doesn't work, of course; I need a series of `substitute's and
`eval's, but I can't get it to work. In Perl it would be something
along the lines of $$x, dereferencing $x another time) so that I can
call

  scatter.plot(x, one, two)

as a `short' for

  plot(x$one, x$two)

In my real application I want to pass the `column selecting' arguments
to `subset', which, even with the code of subset.data.frame to hand, I
have not managed to do after a whole evening...

Thanks,

-- 
David A. van Leeuwen < @ElseWare.nl>

Dying in real style is something you do / on someone else's doormat
On the day that you get delivered / by the NRC Handelsblad
(Joop Visser)
"David A. van Leeuwen" <myfirstname at elseware.nl> writes:> Hi, > > i am working with large data frames with many dependend variables. I > want to write some functions that will allow me to quickly select > variables from the frame and plot them in various colors depending on > factor columns, possibly selecting rows according to factor > conditions. In order to do this in a nice function, i need to > understand how to work with a column name in the body of a > function. To simplify my problem, how do i write a function with a > body like > > > scatter.plot <- function (data, x, y) { > plot(data$x, data$y) > }Use plot(data[[x]], data[[y]]) instead -- Douglas Bates bates at stat.wisc.edu Statistics Department 608/262-2598 University of Wisconsin - Madison http://www.stat.wisc.edu/~bates/

Peter Dalgaard wrote:

> I think David was
> looking for something like
>
> function(data, x, y)
>   eval(substitute(plot(x,y)), data)
>

This was _exactly_ what I was looking for, after having spent quite
some time reading the language definition and some example code such
as subset.data.frame. Thanks!

So now I've got my function that plots two dependent variables in
different colours and/or point types. It even makes legends, although
the positioning isn't great. It could probably do with some cleaning
up, but hey, I'm new. There might even be a function that does this
already, but I haven't found it yet.

---david

## Plots columns `xcol' vs. `ycol' in colours according to the levels
## of `ccol', and point types according to the levels of `pcol'.
##
## Optionally first selects rows under condition `cond'.
plotcol <- function(data, xcol, ycol, ccol=NULL, pcol=NULL, cond=TRUE) {
  ## extract relevant data
  s <- eval(substitute(subset(data, cond, c(xcol,ycol,ccol,pcol))), data)
  ## help to place the legends
  xrange <- range(s[,1], na.rm=TRUE)
  yrange <- range(s[,2], na.rm=TRUE)
  ## colour
  if (missing(ccol)) {
    col <- 1
  } else {
    ccol <- eval(substitute(ccol), s)   # replace name with data
    col <- as.numeric(ccol)             # make colours from factor codes
  }
  ## point type
  if (missing(pcol)) {
    pch <- 1
  } else {
    pcol <- eval(substitute(pcol), s)   # replace name with data
    pch <- as.numeric(pcol)             # make point types from factor codes
  }
  ## make the plot
  plot(s[,1:2], col=col, pch=pch)
  ## and fix the legends
  if (length(col)>1) {
    size <- legend(xrange[1], yrange[2], legend=levels(ccol),
                   col=1:nlevels(ccol), pch=1, lty=1)
    y <- yrange[2]-size$rect$h          # calculate pos of next legend
  } else {
    y <- yrange[2]
  }
  if (length(pch)>1)
    legend(xrange[1], y, legend=levels(pcol), pch=1:nlevels(pcol))
}
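
The eval(substitute(...)) idiom from Peter Dalgaard's suggestion can be looked at in isolation. The following is only a sketch under assumptions: the data frame `df` and the helper `show.call` are hypothetical names introduced for illustration, and `scatter.plot` simply gives a name to the anonymous function quoted in this message.

```r
# Hypothetical example data, using the column names from the original question.
df <- data.frame(one = c(1, 2, 3), two = c(2, 4, 6))

# substitute() replaces x and y with the expressions the caller typed,
# without evaluating them; this helper just returns the resulting call.
show.call <- function(data, x, y) {
  substitute(plot(x, y))
}

# eval() then evaluates that call with the columns of `data` visible
# as ordinary variables, so bare column names work.
scatter.plot <- function(data, x, y) {
  eval(substitute(plot(x, y)), data)
}
```

Here show.call(df, one, two) returns the unevaluated call plot(one, two), and scatter.plot(df, one, two) evaluates it inside df. Because whole expressions are captured, something like scatter.plot(df, one, log(two)) should work as well, which is also why `cond' can be handed on to subset in plotcol.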