Some time ago I was allowed to discuss "Diamond Graphs", and whether they would be useful in R, on this mailing list. The August 2003 issue of The American Statistician has finally arrived here and I have been able to read the article. A number of points of interest arise.

1. The article is "A Diamond-Shaped Equiponderant Graphical Display of the Effects of Two Categorical Predictors on Continuous Outcomes" by Xiuhong Li, Jennifer M. Buechner, Patrick M. Tarwater, and Alvaro Muñoz, The American Statistician, August 2003, pages 193-199. (All of the family names are displayed in small caps except for Xiuhong Li's. Does anyone know why?)

2. There are three examples in the paper.

a. Figures 2 and 3 display the likelihood of developing AIDS as the thing to be explained, with plasma HIV-RNA level (measured in copies per ml) and degree of immune deficiency (measured as the count of CD4+ T-lymphocyte cells per cubic mm) as the explanatory variables. The explanatory variables are continuous, not categorical. If the raw data were used, it would be possible to estimate a 2d probability density and display that (a rough sketch of this idea, in R, appears after point 7 below). The variables have been made categorical by cutting each into 5 levels.

I managed to get a copy of the article this is based on. It certainly isn't clear from the diamond graph that the viral load categories were approximately quartiles, with the bottom quartile split in two; so two of the viral load categories have less data than the other three. Much the same happened with the CD4+ levels, which is also not apparent. Using a diamond graph instead of a density plot with rugs on the margins hides the amount of data available for estimating the cells (6 of the 25 cells are empty because there is missing data).

Viral load and CD4+ count were respectively the best and second best predictors out of five predictors (seven are listed at one point; I guess this means that CD3+ and CD8+ levels weren't useful at all). I have skimmed the article a couple of times, and cannot figure out why just two predictors were chosen. It would be of interest to see graphs for one predictor, two predictors, and three predictors. I have not yet seen any diamond graphs with three explanatory variables...

b. Figures 4-6 display the age-adjusted rate of end-stage renal disease due to any cause (per 100,000 person-years) as the thing to be explained, with systolic blood pressure (measured in mm Hg) and diastolic pressure (measured in mm Hg) as the explanatory variables. Once again the explanatory variables are continuous, not categorical. They are cut to 6 levels each. With the raw data, one could perhaps get a contour plot of the fitted disease rate and a scatterplot of the explanatory variables on the same graph. 4 of the 36 cells are empty, but in this case the values in the cells basically _are_ counts, so we are _not_ left wondering how much data each cell is based on. I have not yet seen the article this was based on.

c. Figure 7 has two graphs. On the left it's relative risk of breast cancer as the thing to be explained, with adult weight change (measured in kg; why not as a proportion of starting weight?) and hormone use (never, past, current) as the explanatory variables. On the right excess risk is to be explained, with the same explanatory variables. One of the explanatory variables is continuous. (Although it is not obvious to me that a weight change of 10 kg in a 55 kg woman should have the same significance as a weight change of 10 kg in a 75 kg woman.)
It strikes me that the other explanatory variable may well be an approximation to a continuous predictor also (some kind of exponentially weighted dose, perhaps). I have not yet seen the article this was based on. I have seen the abstract, though, which draws a conclusion somewhat at odds with the apparent significance of these graphs. I expect that this shows that the diamond graphs _are_ useful. Like "a", we get no idea of how much data each cell is based on.

In no case were there really two categorical predictors to start with.

3. I finally pinned down what these graphs remind me of: the two-way plots described in Tukey's 1977 book "Exploratory Data Analysis", which is not cited in the article. Tukey's basic idea goes like this:
(1) Fit an additive model to the data (median polish, whatever).
(2) Tilt and spread the axes so that the vertical dimension is the fitted values.
(3) Draw lines, not boxes, so that the fitted value for X=m, Y=n is the level at the intersection of the lines for X=m and Y=n.
(4) Now that you can see what the fitted values are from the intersections, plot the residuals, either as sticks from the intersection to the true value, or as variously sized/shaped blobs to show the relative magnitudes of the residuals.

Once I realised this, I realised what really bothered me about these graphs. They simply summarise the raw data (crudely). There is no "data = fit + residuals". I found myself _itching_ for the raw data so that I could see what was really going on.

4. The paper compares a diamond graph (figure 5) with a trellis graph, or rather, a pair of trellis graphs (figure 6), for the same data. I felt much more comfortable with the trellis graph, largely because the numbers (in the range 0..200ish) were well spread out. The trellis graph was, however, much bigger, and is less immediately accessible; the diamond graph conveys an impression of understanding without needing a lot of explanation.

5. My analysis of perceptual issues was right in some respects and wrong in others. The hexagons have been very carefully designed so that
- the area is proportional to p,
- one length is proportional to p, and
- the difference of two other lengths is proportional to p,
where p is the value in the interval [0,1] which is to be presented. I must say that for me the visually most salient length is one which is _not_ proportional to p (it's (1+p)/2).

6. The article neither presents nor cites any experimental data to show that any task can be completed faster or more accurately using diamond graphs than with some other kind of display. As yet, it's a matter of opinion.

7. It is unfortunate that the name "diamond graph" was chosen; "diamond graph" is an established technical term in mathematics. You could get much the same effect by plotting discs of varying sizes, or sectors of varying width, or little thermometers, or practically anything varying in area, on a standard horizontal & vertical table, and then rotating the paper by hand (a rough sketch of this appears below). You won't be able to estimate sizes accurately, but thanks to the cramped range available you aren't going to estimate sizes accurately from a diamond graph anyway (unless you can read the number displayed in the centre, which in my photocopy of the article I can't).

In short, diamond graphs offer a reasonably clear way to summarise some kinds of data, particularly for non-statisticians, but neither express nor lead to any kind of analysis.
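For what it's worth, here is the kind of density-plus-rugs display I had in mind in 2a (and the contour-plus-scatter idea in 2b), as a minimal R sketch. Everything in it is simulated: the variable names, the ranges, and the relationship between the two variables are my inventions, not values from the AIDS study. The point is only that the scatter and the marginal rugs show how much data each region of the plot is based on, which cutting both predictors into quintiles hides.

    ## Simulated data only: variable names and values are invented, not from
    ## the AIDS study discussed above.
    library(MASS)                            # for kde2d()

    set.seed(1)
    log.rna <- rnorm(300, mean = 4, sd = 1)  # pretend log10 viral load
    cd4     <- pmax(500 - 80 * (log.rna - 4) + rnorm(300, sd = 150), 0)

    dens <- kde2d(log.rna, cd4, n = 50)      # 2d kernel density estimate
    contour(dens,
            xlab = "log10 plasma HIV-RNA (copies/ml)",
            ylab = "CD4+ count (cells per cubic mm)")
    points(log.rna, cd4, pch = ".", col = "grey50")   # the raw data
    rug(log.rna, side = 1)                   # marginal rugs show where
    rug(cd4, side = 2)                       #   the data actually are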
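And here is a rough sketch of the "ordinary table, rotate the paper by hand" alternative from point 7. The values of p are invented; the circles are drawn with radius proportional to sqrt(p), so their areas are proportional to p.

    ## Invented values of p in [0,1]; one disc per cell of a 5 x 5 table.
    set.seed(1)
    p  <- matrix(runif(25), nrow = 5,
                 dimnames = list(paste("row", 1:5), paste("col", 1:5)))
    xy <- expand.grid(col = 1:5, row = 1:5)

    plot(xy$col, xy$row, type = "n", xaxt = "n", yaxt = "n",
         xlab = "predictor 1 (categories)", ylab = "predictor 2 (categories)")
    symbols(xy$col, xy$row,
            circles = sqrt(p[cbind(xy$row, xy$col)]) / 4,  # area proportional to p
            inches = FALSE, add = TRUE)
    axis(1, at = 1:5, labels = colnames(p))
    axis(2, at = 1:5, labels = rownames(p))
    ## now rotate the page 45 degrees by hand for the "diamond" orientation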
Since R is a statistics package rather than a "business presentation graphics" package, perhaps Tukey-style two-way plots (are they already available somewhere?) would be more useful than diamond graphs.
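To make point 3 concrete, here is a minimal sketch of such a two-way plot in base R, using a small made-up table and medpolish() for the additive fit: rows give lines of slope +1, columns give lines of slope -1, the intersections sit at the fitted values, and residuals are drawn as sticks from the intersections. The data and the choice of coordinates are mine; this is a sketch, not a polished function.

    ## Made-up two-way table (rows and columns are arbitrary labels).
    tab <- matrix(c(14, 20, 26,
                    18, 24, 32,
                    25, 31, 40,
                    30, 37, 47),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("row", 1:4), paste0("col", 1:3)))

    fit <- medpolish(tab, trace.iter = FALSE)   # data = overall + row + col + residual

    ## Coordinates: horizontal = column effect - row effect,
    ##              vertical   = fitted value (overall + row + col).
    x <- outer(-fit$row, fit$col, "+")
    y <- fit$overall + outer(fit$row, fit$col, "+")

    plot(range(x), range(y, y + fit$residuals), type = "n",
         xlab = "column effect - row effect", ylab = "fitted value")

    for (i in seq_len(nrow(tab))) lines(x[i, ], y[i, ])   # one line per row (slope +1)
    for (j in seq_len(ncol(tab))) lines(x[, j], y[, j])   # one line per column (slope -1)

    ## Residual "sticks" from each intersection to the observed value,
    ## so that data = fit + residual is visible on the plot.
    segments(x, y, x, y + fit$residuals, col = "grey40")
    points(x, y + fit$residuals, pch = 16, cex = 0.6)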
I have now obtained copies of all three medical papers that the "Diamond Graphs" article based its examples on.

Figures 4 and 5: "Blood Pressure and End-Stage Renal Disease in Men".

The two predictor variables (systolic and diastolic blood pressure) are not only continuous, they are correlated. Recoding them as some kind of "size" (c1.diastolic + c2.systolic) and "shape" (maybe log(systolic/diastolic)) might have been interesting (a tiny sketch of this appears below).

The real summary that I think anyone reading that paper would rely on is not the 3d bar chart (figure 2) but a table (table 3) which relates blood pressure category (optimal, normal, high-normal, stage 1/2/3/4 hypertension) to adjusted relative risk (with 95% confidence interval). Reading the article, other (listed) factors also affected relative risk, and it could have been useful to present some kind of multi-dimensional table.

Comparing the original 3d bar plot and table with the diamond graph, two things stand out:

(a) The higher the bar (= the bigger the hexagon), the *less* the amount of data it is based on. This can be seen very clearly in the table; it cannot be seen at all in either the bar plot or the diamond graph. If I'm reading the article correctly (hard, because the table and the 3d bar plot don't use exactly the same categories), the lowest bar is based on 40 times as much data as the highest bar (and the relative risk has a suitably wide confidence interval).

(b) One would expect the risk to increase monotonically with each predictor. It doesn't. This stands out very clearly in the 3d bar plot. It is very hard to see at all in the diamond graph. Once I saw it in the 3d plot, I could (just) detect it in the diamond graph, but the diamond graph would never have called my attention to it.

In fairness to both the 3d bar plot and the diamond graph, they _could_ be made to show an equivalent of error bars. Let the bar (or hexagon) be coloured black from 0 to the lower end-point of the confidence interval, then red (if colour is desired) or grey (if it is not) from the lower end-point of the confidence interval to the upper end-point, with a "black belt" at the nominal value (a rough R sketch of this idea appears below). [Oh DRAT! I could have patented that extension to diamond graphs! Tsk tsk. I'll never get rich, I'll always be 'ard up.]

Figure 7: "Dual Effects of Weight and Weight Gain on Breast Cancer Risk".

In my previous message, I commented that I found it hard to believe that weight *change* should be considered alone. It wasn't. In fact, that's part of the point of the article. I also commented that it seemed to me that the one categorical predictor was probably a surrogate for a continuous variable. Imagine me slapping my head and saying "but I _knew_ that!" The explanation is in the editor's comment, not cited in the Diamond Graphs paper, so here it is:

Editorial, "Weight and Risk for Breast Cancer", Jennifer L. Kelsey & John Baron, JAMA, November 5, 1997, Vol 278, No. 17.

The point is that weight and hormone treatment are *both* surrogates for "lifetime estrogen dose profile". In post-menopausal women, female hormones _are_ still produced, in fat (which is an active tissue). I _knew_ that. So in fact there is a _single_ explanatory variable (some kind of weighted cumulative exposure) which both hormone therapy and body mass index affect. This raises the obvious point that adding a third predictor (typical hormone levels during the years of fertility) might well be very informative. But how would diamond graphs cope with that?
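Before returning to the breast cancer study, here is the kind of "size and shape" recoding of the blood pressures I had in mind, as a tiny R sketch. The weights c1 = c2 = 1/2 are an arbitrary choice of mine, and the blood-pressure values are simulated, not taken from the paper.

    ## Simulated, correlated blood pressures (values invented for illustration).
    set.seed(2)
    systolic  <- rnorm(200, mean = 135, sd = 20)
    diastolic <- 0.3 * (systolic - 135) + rnorm(200, mean = 85, sd = 10)

    size  <- (systolic + diastolic) / 2     # "size": overall pressure level
    shape <- log(systolic / diastolic)      # "shape": relative difference

    plot(size, shape,
         xlab = "size: (systolic + diastolic) / 2  (mm Hg)",
         ylab = "shape: log(systolic / diastolic)")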
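And here is a rough 2d sketch of the "error bars inside the bar" idea: black from 0 up to the lower confidence limit, grey from the lower to the upper limit, and a black belt at the nominal value. The relative risks and confidence intervals are invented for illustration, and the category labels are only loosely based on those mentioned above; none of these numbers come from the paper.

    ## Invented relative risks and 95% CIs; categories condensed from those above.
    lab <- c("optimal", "normal", "high-normal", "stage 1", "stage 2+")
    rr  <- c(1.0, 1.3, 1.9, 3.1, 4.5)       # nominal relative risk (invented)
    lo  <- c(0.8, 0.9, 1.2, 1.8, 2.0)       # lower 95% limit (invented)
    hi  <- c(1.3, 1.9, 3.0, 5.4, 10.1)      # upper 95% limit (invented)

    plot(c(0.5, length(rr) + 0.5), c(0, max(hi)), type = "n", xaxt = "n",
         xlab = "blood pressure category", ylab = "relative risk")
    axis(1, at = seq_along(rr), labels = lab, cex.axis = 0.8)

    w <- 0.35                               # half-width of each bar
    for (i in seq_along(rr)) {
      rect(i - w, 0,     i + w, lo[i], col = "black")   # black up to the lower limit
      rect(i - w, lo[i], i + w, hi[i], col = "grey")    # grey across the interval
      segments(i - w, rr[i], i + w, rr[i], lwd = 3)     # "black belt" at the estimate
    }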
But wait: the abstract of the breast cancer paper says "Higher [body mass index] was associated with LOWER breast cancer incidence before menopause" but a "positive relationship was seen among postmenopausal women who had never used hormone replacement". It also says "Weight gain after the age of 18 years was UNRELATED to breast cancer incidence before menopause but was POSITIVELY associated with incidence after menopause". The editorial cited above makes this point also.

That is, in order to see the results of that study, you need a display which
- shows weight,
- shows weight change,
- shows hormone therapy use, and
- distinguishes between breast cancer before menopause and breast cancer after menopause.

The first sentence in the body of the paper is "The relation of body weight to breast cancer is complex." If there is an easy way to produce an "equiponderant display" with three predictors on a two-dimensional piece of paper, I do not know what it may be. It is certain that diamond graphs, as described in the TAS article, cannot do justice to the data from this study.

In contrast, the tables in the paper made the difference between pre- and post-menopausal outcomes clear, and above all, included confidence intervals. Why do the confidence intervals matter? Well, table 2 of the paper shows that the "multivariate-adjusted relative risk" confidence intervals for premenopausal women all contain 1 (with a fairly high p for trend), so there _might_ not, on this evidence, be any effect at all, while the relative risk confidence intervals for postmenopausal women all contain 1 except for gains of 20 kg or more (where the relative risk could be as low as 1.2). Since the study was based on 1000 premenopausal women and 1517 postmenopausal ones, while the effect is biologically plausible, it doesn't appear to be anywhere near as strong as one might fear.

Once again, BOTH 3d bar plots AND diamond graphs are at fault for not giving any indication of variability/noise/error bars/..., and BOTH could be fiddled with to improve this. In this case, it is quite impossible to see from the diamond graphs in figure 7 of the TAS article what is quite clear from the tables in the original source.

My background is AI, not medicine, so I came to these articles with a "machine learning" bias. I was expecting to see models trained on a subset of the data and evaluated on another subset (cross-validation). None of the papers did that. One of the many things to like about R is that it makes it comparatively easy to do cross-validation (a minimal sketch appears at the end of this message).

What have we seen as common themes?
1. The so-called "categorical" variables were (in 5 out of 6 cases) measured as continuous variables and then cut to quartiles or quintiles or the like.
2. More than 2 explanatory variables were considered in the sources, and in each case more than 2 were actually important, or at least useful.
3. Presenting information without "error bars" can be seriously misleading.

How does R help?
1. R lets us do scatter plots, smoothing, density estimation, &c.
2. R gives us "lattice" plots, amongst others.
3. We can construct graphs with error bars in R.

The big challenge seems to be graphical presentation of higher-dimensional data, things like spinning plots, grand tours, &c. And for that, there's Rggobi.
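Finally, since I claimed above that R makes cross-validation comparatively easy, here is a minimal 10-fold cross-validation sketch. The data are simulated and the linear model is just a placeholder, not a reconstruction of anything in the three papers.

    ## Simulated data; lm() stands in for whatever model one wants to assess.
    set.seed(3)
    n   <- 200
    dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
    dat$y <- 1 + 0.5 * dat$x1 - 0.3 * dat$x2 + rnorm(n)

    k    <- 10
    fold <- sample(rep(seq_len(k), length.out = n))   # random fold assignment
    pred <- numeric(n)
    for (f in seq_len(k)) {
      test       <- fold == f
      fit        <- lm(y ~ x1 + x2, data = dat[!test, ])   # fit on k-1 folds
      pred[test] <- predict(fit, newdata = dat[test, ])    # predict the held-out fold
    }
    mean((dat$y - pred)^2)                  # cross-validated mean squared error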