RexBryan
2002-Dec-10 00:35 UTC
[R] FW: Answers to "Problem with differences between S+ and R in parsing output tables with $"
To R-wizards: Thank you for your quick an insightful responses. I now have R running a translated version of S+ code for power calculations for non-normal data. This is from the Michael Crawley's new book Statistical Computing--An Introduction to Data Analysis using S-Plus. Peter Dalgaard corrected me on the issue of the $ operator not being a parser. It sure looked like an advanced Regex type command to me! Andy Liaw was there quickly with the answer, while Ben Bolker and Douglas Bates showed the importance of the str() function to sort out problems like these. Thanks to all. I will ask more questions in the future as I plunge more and more into R. The code set forward by M. Crawley seems to eloquently establish a fundamental calculating engine for determining the power of a test for non-normal data. If one has seen how many contorted lines are required for the same thing in programs like C++, FORTRAN, SAS or Statistica you would appreciate my use of term "eloquence". How-to discussions of the calculation of Power [1 - F(-)] tends to be glossed over even in them most advanced of statistical books. Not so for Statistical Computing...for that I recommend it. The code with S+ and R variants that work is shown below. It would be nice if someone could come up with a version that can be used by both S+ and R without transform. The use of the statement "summary(aov(model))" seems more universal than the statement "summary.aov(model)". Anyone know why?>d<-numeric(1000) >n<-350 >for(i in 1:1000){ > y1<-rnbinom(n,1,.3) > y2<-rnbinom(n,1,.25) > y<-c(y1,y2) > fa<-factor(c(rep(1,n),rep(2,n))) > model<-glm(y~fa,poisson) ># S+ variants >#d[i]<-as.vector(summary(aov(model))$"Pr(F)")[1]} #A. Does work inS+>#d[i]<-as.vector(summary.aov(model)$"Pr(F)")[1]} #B. Does work inS+>#------------------------------------------------------------------------># R variants >d[i]<-as.vector(summary(aov(model))[[1]]$"Pr(>F)"[1])#C. Does work in R>#------------------------------------------------------------------------>}Essentially an F test comparing two means is been performed 1000 times, with 1000 tables produced and only the alpha level extracted and kept. These 1000 alphas are kept in vector d[]. The question then is asked: How many times out of 1000 does a random sampling of two distributions known to have different means produce a statistically significant difference? That proportion is considered an estimate of the Power or the test. In this case the simulation could produce a number like 0.840 given when the alpha level of 0.05 and n1=n2=350 are set. Ex:> sum(d<0.05)/1000 >[1] 0.8400Why I call this a "Power calculating engine" is that: 1. Vectors y1 and y2 can have different n1 and n2. 2. y1 and y2 can come from ANY distribution, even complex ones 3. y1 and y2 can even come from distributions that are not easily defined, i.e. from bootstrapping. 4. The code can be tailored to estimate Power even when multi-stage tests are proposed. I hope that my intuition is correct. REX -----Original Message----- From: RexBryan [mailto:rexbryan1 at attbi.com] Sent: Monday, December 09, 2002 11:51 AM To: 'r-help at lists.R-project.org' Subject: Problem with differences between S+ and R in parsing output tables with $ R-wizards I have successfully run with S+ a statistical power calculation for non-normal distributions as presented in M. Crawley?s new book.? When I tried the newest version of R on the same code, the $ parse statement doesn't seem to retrieve the appropriate number from a table. Note that some of the cosmetic differences in the two tables have to be dealt with by the parser. Any ideas what's happening? REX # Begin R ------------------------------------------------------------- #> summary(aov(model))Df Sum Sq Mean Sq F value Pr(>F) fa 1 11.1 11.1 0.9327 0.3345 Residuals 698 8279.3 11.9 # R: ... trying to parse the table gives a NULL for the probability of F...> (summary(aov(model))$"Pr(>F)")[1]NULL # End R --------------------------------------------------------------- # Begin S+ ------------------------------------------------------------> summary(aov(model))Df Sum of Sq Mean Sq F Value Pr(F) fa 1 11.063 11.06286 0.9326708 0.3345044 Residuals 698 8279.314 11.86148 # S+ ... using $ to parse the table gives the right answer for the probability of F... (summary(aov(model))$"Pr(F)")[1] [1] 0.3345044 # End S+ --------------------------------------------------------------