Hello,

I am looking for a sample size function to test proportions that are not binomial proportions. The proportions represent a ratio of (final measure) / (baseline measure) on the same experimental unit. Searches using RSeek and such bring multiple hits for binomial proportions, but that doesn't seem to fit my situation. Perhaps there's some standard terminology from a different field that would provide better hits than deeming this a 'rate' or a 'proportion'.

Of course, most sample size functions assume a normal distribution, while these data will be bounded between 0 and 1. The scientist I'm working with feels that, to make fair comparisons, any weight loss must account for the baseline weight. A logistic transformation seems appropriate, but that term also didn't yield hits I recognized as useful.

Loss of weight, comparing treatments:
Treatment A: 1 - Final weight / Initial weight
Treatment B: 1 - Final weight / Initial weight

This appears to be a situation that would be common, but I'm not framing it in a way that matches an R package. Any guidance is appreciated.

Regards,
Paul

Paul Prew ▪ Statistician
651-795-5942 ▪ fax 651-204-7504
Ecolab Research Center ▪ Mail Stop ESC-F4412-A
655 Lone Oak Drive ▪ Eagan, MN 55121-1560
On Mar 23, 2010, at 11:05 AM, Prew, Paul wrote:

> Hello, I am looking for a sample size function to test proportions that are not binomial proportions.
> [...]

If you and the scientist are open to better options for analyzing "change from baseline" data, I would recommend that you both read the following two papers:

Statistics notes: analysing controlled trials with baseline and follow up measurements.
Vickers AJ, Altman DG.
BMJ 2001;323:1123-4.
http://www.bmj.com/cgi/content/full/323/7321/1123
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1121605/pdf/1123.pdf

The use of percentage change from baseline as an outcome in a controlled trial is statistically inefficient: a simulation study.
Vickers AJ.
BMC Med Res Methodol 2001;1:6.
http://www.biomedcentral.com/1471-2288/1/6
http://www.biomedcentral.com/content/pdf/1471-2288-1-6.pdf

and review an additional web site:

http://biostat.mc.vanderbilt.edu/wiki/Main/MeasureChange

Once you have adopted a regression-based approach (e.g., FinalWeight ~ BaseWeight + Treatment), there are various options for calculating sample sizes. The key advantage of this approach is that you get the baseline-adjusted between-group comparison (the regression beta coefficient and confidence interval for Treatment), which is the key outcome of interest when comparing treatments in a parallel design.

The easiest, albeit conservative, approach for sample size is to use power.t.test() with your assumptions for the inter-group delta in actual weight change (not percent change), the std dev of the actual change, the desired power and the target alpha.

I am not aware off-hand of any power/sample size functions in R for regular linear regression, though they may exist. There are third-party programs that do provide that functionality. If you are willing to code and experiment a bit, you could construct a Monte Carlo simulation with a linear model, using data generated with rnorm() based upon reasonable assumptions about the distribution of the baseline and final values in each group.

Once your actual data are collected and ready for analysis, you will also need to test for a baseline*treatment interaction (FinalWeight ~ BaseWeight * Treatment). An interaction makes the interpretation of treatment effects more complicated, since the treatment effect is then conditional upon the baseline weight and you can no longer report a single mean treatment effect.

HTH,

Marc Schwartz
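
The two sample size routes described above (power.t.test() on the actual change, and a Monte Carlo simulation of the baseline-adjusted linear model) might be sketched roughly as follows. This is only an illustration: every planning value (delta, baseline mean and SD, slope, residual SD, intercept) is invented for the example and does not come from the thread, and the function name sim_power() and the lower-case variable names are hypothetical. The data-generating model, a normal baseline with a linear effect on the final weight, is also an assumption that should be replaced with whatever is plausible for the real experiment.

```r
# All planning values below are made-up placeholders, not values from the thread.
set.seed(42)

delta    <- 2     # assumed between-group difference in final weight
base_sd  <- 10    # assumed SD of baseline weight
slope    <- 0.7   # assumed regression slope of final on baseline weight
resid_sd <- 3     # assumed residual SD around the regression line

# Route 1: conservative t-test calculation on the actual change (final - baseline).
# Under the assumed model, the SD of the change is:
change_sd <- sqrt(((slope - 1) * base_sd)^2 + resid_sd^2)
pt <- power.t.test(delta = delta, sd = change_sd, power = 0.80, sig.level = 0.05)
n_ttest <- ceiling(pt$n)   # subjects per group

# Route 2: Monte Carlo power for FinalWeight ~ BaseWeight + Treatment.
# sim_power() is a hypothetical helper written for this sketch.
sim_power <- function(n_per_group, delta = 2, base_mean = 80, base_sd = 10,
                      slope = 0.7, resid_sd = 3, alpha = 0.05, n_sims = 500) {
  treatment <- factor(rep(c("A", "B"), each = n_per_group))
  p_values <- replicate(n_sims, {
    base_weight  <- rnorm(2 * n_per_group, mean = base_mean, sd = base_sd)
    # Treatment B lowers the final weight by `delta`; the intercept 5 is arbitrary.
    final_weight <- 5 + slope * base_weight - delta * (treatment == "B") +
      rnorm(2 * n_per_group, mean = 0, sd = resid_sd)
    fit <- lm(final_weight ~ base_weight + treatment)
    coef(summary(fit))["treatmentB", "Pr(>|t|)"]
  })
  mean(p_values < alpha)   # proportion of simulations rejecting H0
}

# Estimated power of the adjusted model at the t-test sample size.
power_at_ttest_n <- sim_power(n_ttest)

# Interaction check on one simulated data set, as would be done on real data.
n <- n_ttest
d <- data.frame(Treatment  = factor(rep(c("A", "B"), each = n)),
                BaseWeight = rnorm(2 * n, mean = 80, sd = base_sd))
d$FinalWeight <- 5 + slope * d$BaseWeight - delta * (d$Treatment == "B") +
  rnorm(2 * n, sd = resid_sd)
fit_add <- lm(FinalWeight ~ BaseWeight + Treatment, data = d)
fit_int <- lm(FinalWeight ~ BaseWeight * Treatment, data = d)
interaction_test <- anova(fit_add, fit_int)   # F test for the interaction term
```

Calling sim_power() over a grid of n_per_group values and picking the smallest one that reaches the target power gives a simulation-based sample size. Under these assumed values the adjusted model should need fewer subjects than the t-test on change scores, which is the sense in which the power.t.test() answer is conservative.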
Dear Marc,

Thank you very much for the advice and the papers. It helps.

Regards,
Paul

> From: Marc Schwartz <marc_schwartz@me.com>
> To: "Prew, Paul" <Paul.Prew@ecolab.com>
> Cc: r-help@r-project.org
> Subject: Re: [R] Sample size for proportion, not binomial