Richard Friedman
2011-May-23 23:23 UTC
[R] Analog of least significant difference error bars for proportions
Dear R-list,

In the R-book, p.464, Michael Crawley recommends that error bars for bar plots of normally distributed continuous response variables with categorical explanatory variables be given by 1/2 of the least significant difference, where the least significant difference is defined as

qt(0.975,degrees_of_freedom)*standard_error_of_the_difference.

The idea is that the above quantity visually conveys whether or not the means are different more realistically than do standard errors.

I have analyzed proportions with categorical variables using the glm function with a binomial error model. I wish to plot a bar graph in which the heights of the bars are the proportions. Is there a way to define error bars analogous to the least significant difference bars described above that can convey the overlap of proportions? The experimentalists with whom I work just love error bars. I would like to make them as meaningful as possible.

Thanks and best wishes,
Rich

------------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist, Biomedical Informatics Shared Resource
Herbert Irving Comprehensive Cancer Center (HICCC)
Lecturer, Department of Biomedical Informatics (DBMI)
Educational Coordinator, Center for Computational Biology and Bioinformatics (C2B2)/
National Center for Multiscale Analysis of Genomic Networks (MAGNet)
Room 824 Irving Cancer Research Center
Columbia University
1130 St. Nicholas Ave
New York, NY 10032
(212)851-4765 (voice)
friedman at cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/

I am a Bayesian. When I see a multiple-choice question on a test and I don't know the answer I say "eeney-meaney-miney-moe".
Rose Friedman, Age 14
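For reference, the half-LSD bar described in the message above can be sketched in R for the ordinary normal-response case. This is only an illustration under assumptions not stated in the post: the data are simulated, the groups have equal size, and the standard error of the difference is taken from the pooled residual variance of a one-way aov() fit. All variable names are made up for the example.

```r
# Hypothetical data: three groups of 10 observations each (simulated, not from the post)
set.seed(1)
n <- 10
y <- c(rnorm(n, mean = 5), rnorm(n, mean = 6), rnorm(n, mean = 8))
g <- factor(rep(c("A", "B", "C"), each = n))

fit      <- aov(y ~ g)
df_resid <- df.residual(fit)            # residual degrees of freedom (30 - 3 = 27)
s2       <- deviance(fit) / df_resid    # pooled error variance (residual SS / df)

se_diff <- sqrt(2 * s2 / n)             # standard error of a difference of two means
lsd     <- qt(0.975, df_resid) * se_diff  # least significant difference

means <- tapply(y, g, mean)
lower <- means - lsd / 2                # bars extend half an LSD below ...
upper <- means + lsd / 2                # ... and half an LSD above each mean
```

With bars of this length, two bars fail to overlap exactly when the two means differ by more than one LSD.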
Rolf Turner
2011-May-24 00:03 UTC
[R] Analog of least significant difference error bars for proportions
On 24/05/11 11:23, Richard Friedman wrote:
> Dear R-list,
>
> In the R-book, p.464, Michael Crawley recommends that error
> bars for bar plots of normally distributed continuous response
> variables with categorical explanatory variables be given by
> 1/2 of the least significant difference, where the least significant
> difference is defined as
>
> qt(0.975,degrees_of_freedom)*standard_error_of_the_difference.
>
> The idea is that the above quantity visually conveys whether or not
> the means are different more realistically than do standard errors.
>
> I have analyzed proportions with categorical variables using
> the glm function with a binomial error model. I wish to plot a bar
> graph with the height of the bars the proportions. Is there a way
> to define error bars analogous to the least significant difference bars
> described above that can convey the overlap of proportions?
> The experimentalists with whom I work just love error bars. I would
> like to make them as meaningful as possible.

(1) The errbar() function in the Hmisc package will allow you to set any "spread" that you wish on your error bars.

(2) In respect of maximal meaningfulness: The naive viewer tends to interpret error bars by concluding that if the ranges of two pairs of error bars do not overlap then the two quantities being estimated are "significantly different". Hence it strikes me that you might want to imitate what is done for the notches in boxplots, which are designed to make such an interpretation roughly correct. From the help on boxplot.stats():

> The notches (if requested) extend to +/-1.58 IQR/sqrt(n). This seems
> to be based on the same calculations as the formula with 1.57 in
> Chambers et al. (1983, p. 62), given in McGill et al. (1978, p.
> 16). They are based on asymptotic normality of the median and roughly
> equal sample sizes for the two medians being compared, and are said to
> be rather insensitive to the underlying distributions of the samples.
> The idea appears to be to give roughly a 95% confidence interval for
> the difference in two medians.

cheers,

Rolf Turner
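The notch formula quoted from the boxplot.stats() help can be checked directly. The sketch below uses a simulated sample (an assumption for illustration only) and recomputes the notch limits by hand from the hinges that boxplot.stats() returns, then compares them with the `conf` component.

```r
# Hypothetical skewed sample (simulated, not from the thread)
set.seed(2)
x <- rexp(40)
n <- length(x)

st <- boxplot.stats(x)                 # stats = lower whisker, lower hinge, median, upper hinge, upper whisker

hinge_spread <- diff(st$stats[c(2, 4)])   # hinge-based IQR
manual <- st$stats[3] + c(-1.58, 1.58) * hinge_spread / sqrt(n)

# The hand-computed limits should agree with the notch limits reported by boxplot.stats()
agree <- isTRUE(all.equal(manual, st$conf))
```

Note that the spread used here is the distance between the hinges, which can differ slightly from the value returned by IQR().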
Richard Friedman
2011-May-24 15:46 UTC
[R] Analog of least significant difference error bars for proportions
Dear Rolf (and List),

Thank you for your help on error bars. I fear that neither of the suggestions quite answers my immediate need.

1. Notches will not work because I have more than 2 levels.
2. The errbar function will be useful once I know the error bars to put in.

I think I have figured it out, but I would greatly appreciate feedback (positive or negative) from the list:

For 2 or more levels with ordinary ANOVA the least significant difference error bars are given by

(qt(0.975,degrees_of_freedom)*sqrt((s1^2+s2^2)/sqrt(n)))/2

Am I correct that for 3 levels the error bars are given by

(qt(0.975,degrees_of_freedom)*sqrt((s1^2+s2^2+s3^2)/sqrt(n)))/2

where the argument of the first square root is the standard error of the sample mean?

If I am correct, then an analogous expression would seem to hold where the normal approximation is a good approximation to the binomial distribution:

for 2 samples

z(.475)*sqrt(theta1*(1-theta1)+theta2*(1-theta2))/2

and for 3 samples

z(.475)*sqrt(theta1*(1-theta1)+theta2*(1-theta2)+theta3*(1-theta3))/2

Does this sound right?

Thanks and best wishes,
Rich
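One way to make the two-sample proportion expression above concrete is sketched below. It is not an answer confirmed anywhere in the thread. It assumes the usual normal-approximation variance of a sample proportion, p(1-p)/n, so unlike the expressions as written in the message it divides each term by its group size. It also assumes the 0.975 normal quantile from qnorm() is the intended multiplier. The counts, the group names, and the choice to draw every bar with the largest pairwise half-LSD are hypothetical simplifications, and no adjustment for multiple comparisons is made.

```r
# Hypothetical counts: successes and trials in three groups (made-up numbers)
successes <- c(A = 12, B = 25, C = 33)
trials    <- c(A = 40, B = 40, C = 40)
p_hat     <- successes / trials

z     <- qnorm(0.975)                     # two-sided 5% normal quantile
var_p <- p_hat * (1 - p_hat) / trials     # estimated variance of each proportion

# Standard error of each pairwise difference: sqrt(var_i + var_j)
pairs   <- combn(names(p_hat), 2)
se_diff <- apply(pairs, 2, function(pr) sqrt(sum(var_p[pr])))
names(se_diff) <- apply(pairs, 2, paste, collapse = "-")

half_lsd <- z * se_diff / 2               # half of the LSD for each pair

# Simplification: one common bar length, the largest pairwise half-LSD
bar <- max(half_lsd)

mids <- barplot(p_hat, ylim = c(0, 1), ylab = "Proportion")
arrows(mids, p_hat - bar, mids, p_hat + bar, angle = 90, code = 3, length = 0.1)
```

Because the common bar uses the largest pairwise value, two bars that do not overlap differ by more than their own pairwise LSD under this approximation; the converse need not hold.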