On Friday 17 September 2004 13:52, RenE J.V. Bertin wrote:
> Hello again,
>
> I am doing regressions (using panel.lmline() (and panel.abline(
> rlm(...))) ) inside a panel method which I pass to bwplot().
>
> What I would like to do is create a boxplot of categorised data
> (binned on the independent variable), and superpose a regression line
> which is calculated using the non-categorised, raw data. I expect
> that would give more accurate regression results even if a bwplot
> panel used 'world' co-ordinates.
>
> My initial idea was to do something like
>
> > bwplot( y~x|cond, panel=bwpanel )

Does

    xyplot( y~x|cond, panel=bwpanel, horizontal = FALSE )

do any better?

> instead of
>
> > bwplot( y~ Classify(x, binwidth=0.2) | cond, panel=bwpanel )
>
> with Classify a function which bins x and returns the result in an
> ordered factor,
>
> and
>
> bwpanel <- function(x,y, ... )
> {
>     if( is.factor(x) ){
>         panel.bwplot(x,y, ... )
>         nx<-as.numeric(x)
>     }
>     else{
>         nx<-x
>         x<-Classify(nx, binwidth=0.2 )
>         panel.bwplot(x,y, ... )
>     }
>     # add a line showing the means:
>     panel.linejoin(x,y, fun=function(x) mean(x,na.rm=TRUE),
>                    col="red", lwd=2, ...)
>     panel.lmline( nx, y, ... )
>     # snip
> }
>
> But that doesn't work: bwplot seems to do work on/with x outside of
> the panel function which require x to be a factor.

Yes, it does. If x and y are both numeric, you should use xyplot. It's
perfectly fine to use panel.bwplot as a panel function with xyplot.

> I then tried to copy the code from bwplot into a wrapper which would
> do the formula parsing, and call bwplot with x replaced by the
> properly categorised version, but got stuck along the way.
>
> I have thus written another wrapper, in which I basically do what I
> wanted to do in bwpanel, storing the 'raw' data AND the condition
> array in an environment. I then retrieve these inside bwpanel, make
> the proper selection using
>
> bwpanel(x, y, ... )
> {
>     # snip
>     xx<- xx[ cond == levels(cond)[ list(...)$panel.counter ] ]
>     # snip
>
> }

The 'right' way to do this would be

    bwpanel <- function(x, y, subscripts, ... )
    {
        # snip
        xx<- xx[subscripts]
        # snip
    }

> (xx will thus have the current-panel-appropriate subset of the raw
> independent data). Then I can do the regression with the raw
> observations (y being 'raw'), but now I have to transform the
> obtained coefficients so that they are plotted correctly in the
> viewport being used (basically, the smallest factor is mapped to 1,
> the next to 2, etc).
>
> I have this working (see http://rjvbertin.free.fr/RegrInBWPlot.pdf; I
> can send the code if somebody is interested), but I wonder if
>
> 1) something like this has not been foreseen already in lattice (in
> particular, panel.abline will in general not give correct results
> when called from a bwplot panel function!

Could you expand on that? Incorrect results in what sense? panel.abline
just draws straight lines, it doesn't work with the panel data directly
(so, for instance, whether or not x is a factor cannot affect it). For
variables that are factors, the scales are set up so that the levels
correspond to integers (in other words, as.numeric(x) should give the
correct coordinates). Given that, I'm not sure how panel.abline can give
incorrect results.

> 2) Am I doing the right thing to subset my raw independent value
> array -- in particular, there is also a list(...)$panel.number, which
> I have only seen having the same value as $panel.counter?

As mentioned above, you should probably be using subscripts.
panel.number and panel.counter are going to be the same unless you mess
with perm.cond and index.cond.

Deepayan
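
A minimal sketch of the subscripts approach suggested above, assuming the lattice package is installed. The data frame `d`, the vector `raw_x`, and the bin width `w` are made up for illustration, and cut() stands in for the poster's Classify(), whose code is not shown in the thread. The rescaling of the coefficients assumes that each box sits at the integer code of its bin and represents the bin's midpoint; a different binning function would need a different mapping.

```r
library(lattice)

set.seed(1)
w <- 0.2                                   # assumed bin width
d <- data.frame(x = runif(200),            # hypothetical raw predictor in [0, 1]
                cond = gl(2, 100, labels = c("a", "b")))
d$y <- 2 * d$x + rnorm(200, sd = 0.3)
# cut() stands in for Classify(); bin k covers ((k-1)*w, k*w]
d$xbin <- cut(d$x, breaks = seq(0, 1, by = w), include.lowest = TRUE)

raw_x <- d$x   # full raw vector; each panel picks its rows via 'subscripts'

bwpanel <- function(x, y, subscripts, ...) {
    panel.bwplot(x, y, ...)
    xx <- raw_x[subscripts]                # raw x for this panel only
    cf <- coef(lm(y ~ xx))                 # regression on the unbinned data
    # Boxes are drawn at positions 1, 2, ...; position p is taken to
    # represent the bin midpoint x = w * (p - 0.5), so y = a + b*x becomes
    # y = (a - 0.5*w*b) + (w*b) * p in panel coordinates.
    panel.abline(a = cf[[1]] - 0.5 * w * cf[[2]], b = w * cf[[2]], col = "red")
}

p <- bwplot(y ~ xbin | cond, data = d, panel = bwpanel, horizontal = FALSE)
print(p)
```

Because `subscripts` holds the row indices of the observations shown in the current panel, the same lookup works regardless of how the panels are ordered, which is what makes it more robust than matching on a panel counter.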
> I think that for this kind of purpose, it would be more intuitive
> if bwplot would accept numerical x data, together with a 'binning'
> argument (like histogram's nint).
>
> RenE

I strongly disagree. That's what cut() and R's functional programming
style are for. The number of arguments should (usually) be minimized in
favor of expecting users to use the features that R already provides. I
think Deepayan (and Bill Cleveland, his forebear with trellis plots in
S-Plus) has done a fine job of doing this.

-- 
Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning
process." - George E. P. Box
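
As an illustration of the cut() suggestion, here is a short sketch in base R that bins a continuous variable before it is handed to a plotting function. The vectors `x` and `y` and the 0.2 bin width are assumptions chosen to mirror the case discussed in the thread, not data from it.

```r
set.seed(42)
x <- runif(100)                        # hypothetical raw values in [0, 1]
y <- 2 * x + rnorm(100, sd = 0.3)      # hypothetical response

# Bin x into 0.2-wide intervals and return an ordered factor.
xbin <- cut(x, breaks = seq(0, 1, by = 0.2),
            include.lowest = TRUE, ordered_result = TRUE)

table(xbin)                            # number of observations per bin

# The binned factor can then be used directly in a formula, e.g.
# lattice::bwplot(y ~ xbin, horizontal = FALSE)
```

Doing the binning in the call rather than inside bwplot keeps the choice of breaks visible and lets the same factor be reused elsewhere.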
On Friday 17 September 2004 15:04, RenE J.V. Bertin wrote:

[...]

> Yes, somebody else pointed that out too. I had seen the argument,
> but not seen that it would carry the subscripts to the values being
> plotted in a given panel. Now it is obvious, of course ;) BTW: the
> xyplot manpage is not very explicit as to what arguments are common
> to all functions described: are they all?

Yes, pretty much, unless explicitly mentioned otherwise (e.g. for
'box.ratio'). For functions not documented along with xyplot (like
cloud, splom, etc), their own help pages will sometimes override the
descriptions in ?xyplot.

> 8-) > 1) something like this has not been foreseen already in lattice
> 8-) > (in particular, panel.abline will in general not give correct
> 8-) > results when called from a bwplot panel function!
> 8-)
> 8-) Could you expand on that? Incorrect results in what sense?
> 8-) panel.abline just draws straight lines, it doesn't work with the
> 8-) panel data directly (so, for instance, whether or not x is a
> 8-) factor cannot affect it). For variables that are factors, the
> 8-) scales are set up so that the levels correspond to integers (in
> 8-) other words, as.numeric(x) should give the correct coordinates).
> 8-) Given that, I'm not sure how panel.abline can give incorrect
> 8-) results.
>
> My "in general" above was a little bold. Incorrect results occur
> when your factors are in fact numerical categories, as in my case,
> where I 'bin' a [0:1] continuous variable into 0.2 wide bins. So I
> have factors 0, 0.2, ..., 1 . If I plot the full range, my category 0
> will be mapped onto your plotting co-ordinate 1, and my 1 onto your
> 6. If I then plot e.g. a unity line with panel.abline( 0, 1, ...),
> the drawn line does not correspond to the labels on the axes. Is it
> clear like that? In other words: the co-ordinates I need are
> as.numeric(as.character(x)), as my lines are expressed in terms of
> the numbers shown along both axes, and not in a frame where the
> leftmost point has X==1 and the rightmost X==levels(x)

Right, so the problem is not with panel.abline, which is doing what it's
supposed to do, but with bwplot forcing one of its arguments from
numeric -> factor -> numeric (actually numeric -> shingle -> numeric),
in the process changing the numeric values.

> In this light, I'm not sure how panel.bwplot() is going to work when
> called through xyplot with numerical data: one would have to call it
> with factor(x)?. I think that for this kind of purpose, it would be
> more intuitive if bwplot would accept numerical x data, together with
> a 'binning' argument (like histogram's nint).

I see your point, but I don't think it makes sense to have bwplot accept
numeric data. bwplot (as well as dotplot, stripplot and barchart) is
designed to have a factor (or shingle) as one of its variables, and in
fact that's the main thing that makes it different from xyplot. If you
really want both variables to be numeric, you should use xyplot.

Unfortunately, as you point out, panel.bwplot doesn't work correctly
with xyplot; e.g., the following doesn't work:

    xyplot(sample(0:5/5, 100, T) ~ rnorm(100), panel = panel.bwplot)

This should definitely be fixed, and in fact it is fixed in the
pre-release version of 2.0.0, where panel.bwplot draws a boxplot for
each unique value of y (or x if horizontal = F). The only thing is that
the thickness of the `box'-es is calculated assuming a distance of 1
between consecutive positions, so that has to be explicitly controlled,
as in

    xyplot(sample(0:5/5, 100, T) ~ rnorm(100), panel = panel.bwplot,
           box.ratio = 0.1)

Deepayan
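
The coordinate mismatch described in this exchange can be reproduced in base R alone. The sketch below uses the six bin labels from the thread (0, 0.2, ..., 1); the helper `to_position_coords` is a hypothetical name introduced here, and it assumes equally spaced numeric labels with the first level drawn at position 1.

```r
# Bin labels as in the thread: 0, 0.2, ..., 1
f <- factor(c(0, 0.2, 0.4, 0.6, 0.8, 1))

pos <- as.numeric(f)                 # plotting positions: 1 2 3 4 5 6
val <- as.numeric(as.character(f))   # values shown on the axis labels

# Hypothetical helper: re-express the line y = a + b * v, given in terms of
# the label values v, in terms of the plotting positions p.
# With v = first + width * (p - 1):
#   y = (a + b * (first - width)) + (b * width) * p
to_position_coords <- function(a, b, first = 0, width = 0.2) {
    c(intercept = a + b * (first - width), slope = b * width)
}

# The unity line y = v has intercept -0.2 and slope 0.2 in position units.
ab <- to_position_coords(a = 0, b = 1)
ab
```

Passing the converted intercept and slope to panel.abline would then draw a line that agrees with the axis labels, under the stated assumption of equally spaced bins.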