On Thu, 04 Sep 2003 14:50:25 -0400 "Paul, David A" <paulda at BATTELLE.ORG> wrote:

> I am one of only 5 or 6 people in my organization making the
> effort to include R/Splus as an analysis tool in everyday work -
> the rest of my colleagues use SAS exclusively.
>
> Today, one of them made the assertion that he believes the
> numerical algorithms in SAS are superior to those in Splus
> and R -- ie, optimization routines are faster in SAS, the SAS
> Institute has teams of excellent numerical analysts that
> ensure its superiority to anything freely available, PROC
> NLMIXED is more flexible than nlme( ) in the sense that it
> allows a much wider array of error structures than can be used
> in R/Splus, &etc.
>
> I obviously do not subscribe to these views and would like
> to refute them, but I am not a numerical analyst and am still
> a novice at R/Splus. Do there exist refereed papers comparing the
> numerical capabilities of these platforms? If not, are there
> other resources I might look up and pass along to my colleagues?
>
> Much thanks in advance,
>
> david paul

I don't have papers comparing the numerical capabilities but I say bunk to your colleagues. The last time I looked, SAS still relies on the out-of-date Gauss-Jordan sweep operator in many key places, in place of the QR decomposition that R and S-Plus use in regression. And SAS being closed source makes it impossible to see how it really does calculations in some cases.

See http://hesweb1.med.virginia.edu/biostat/s/doc/splus.pdf Section 1.6 for a comparison of S and SAS (though this doesn't address numerical reliability).

Overall, SAS is about 11 years behind R and S-Plus in statistical capabilities (last year it was about 10 years behind) in my estimation.

Frank Harrell
SAS User, 1969-1991

---
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
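[Editorial sketch, not part of the original message.] To make the sweep-versus-QR contrast above concrete, here is a minimal illustration of both routes to least-squares coefficients. It is written in plain Python rather than R or SAS, the function names (`sweep`, `ols_sweep`, `ols_qr`) and the toy data are made up for this note, and the QR step uses modified Gram-Schmidt as a stand-in, which is not necessarily the routine any of the packages uses. The usual argument for QR is that the sweep route works on the cross-product matrix X'X, whose condition number is the square of that of X.

```python
import math


def sweep(a, k):
    """Apply the sweep operator to square matrix `a` (list of lists) on pivot k, in place."""
    d = a[k][k]
    n = len(a)
    a[k] = [v / d for v in a[k]]  # scale the pivot row
    for i in range(n):
        if i == k:
            continue
        b = a[i][k]
        a[i] = [a[i][j] - b * a[k][j] for j in range(n)]  # eliminate column k
        a[i][k] = -b / d
    a[k][k] = 1.0 / d


def ols_sweep(X, y):
    """Least squares by sweeping the augmented cross-product matrix [X y]'[X y]."""
    p = len(X[0])
    Z = [list(row) + [yi] for row, yi in zip(X, y)]
    a = [[sum(r[i] * r[j] for r in Z) for j in range(p + 1)] for i in range(p + 1)]
    for k in range(p):
        sweep(a, k)
    return [a[i][p] for i in range(p)]  # last column now holds the coefficients


def ols_qr(X, y):
    """Least squares via a modified Gram-Schmidt QR factorization of X itself."""
    n, p = len(X), len(X[0])
    Q = []  # orthonormal columns, built one at a time
    R = [[0.0] * p for _ in range(p)]
    for j in range(p):
        v = [X[i][j] for i in range(n)]
        for i, q in enumerate(Q):
            R[i][j] = sum(qi * vi for qi, vi in zip(q, v))
            v = [vi - R[i][j] * qi for vi, qi in zip(v, q)]
        R[j][j] = math.sqrt(sum(vi * vi for vi in v))
        Q.append([vi / R[j][j] for vi in v])
    qty = [sum(qi * yi for qi, yi in zip(q, y)) for q in Q]
    beta = [0.0] * p
    for j in reversed(range(p)):  # back substitution on R beta = Q'y
        s = qty[j] - sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = s / R[j][j]
    return beta


# Toy data (hypothetical): y = 1 + 2x exactly, with an intercept column.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
X = [[1.0, x] for x in xs]
y = [1.0 + 2.0 * x for x in xs]
beta_sweep = ols_sweep(X, y)
beta_qr = ols_qr(X, y)
```

On well-conditioned data like this the two agree; any difference would be expected to show up only when the columns of X are nearly collinear.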
I am one of only 5 or 6 people in my organization making the effort to include R/Splus as an analysis tool in everyday work - the rest of my colleagues use SAS exclusively.

Today, one of them made the assertion that he believes the numerical algorithms in SAS are superior to those in Splus and R -- ie, optimization routines are faster in SAS, the SAS Institute has teams of excellent numerical analysts that ensure its superiority to anything freely available, PROC NLMIXED is more flexible than nlme( ) in the sense that it allows a much wider array of error structures than can be used in R/Splus, &etc.

I obviously do not subscribe to these views and would like to refute them, but I am not a numerical analyst and am still a novice at R/Splus. Do there exist refereed papers comparing the numerical capabilities of these platforms? If not, are there other resources I might look up and pass along to my colleagues?

Much thanks in advance,

david paul
On Thu, 4 Sep 2003, Paul, David A wrote:

> I am one of only 5 or 6 people in my organization making the
> effort to include R/Splus as an analysis tool in everyday work -
> the rest of my colleagues use SAS exclusively.
>
> Today, one of them made the assertion that he believes the
> numerical algorithms in SAS are superior to those in Splus
> and R -- ie, optimization routines are faster in SAS, the SAS

I can't say for the optimisation routines, but I have found this...

When I was doing my MSc thesis, using tree-based models and neural networks for classification, I discovered something interesting. The Tree Node in SAS Enterprise Miner (SAS EM) is far more efficient than the rpart package. Using the same (or at least very similar) parameter settings, SAS EM can produce a tree in about 1 minute, while rpart takes 5 to 6 minutes (same data, same machine).

Having said that, I still prefer rpart as it can draw a beautiful tree, whereas it is very difficult to fit the graphical tree produced by SAS EM onto one A4 page -- in the end I had to use the text tree.

However, the Neural Network node in SAS EM is less efficient than nnet. Fitting a neural network in R using nnet is much faster.

--
Cheers,

Kevin

------------------------------------------------------------------------------
"On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question."
-- Charles Babbage (1791-1871)
---- From Computer Stupidities: http://rinkworks.com/stupid/

--
Ko-Kang Kevin Wang
Master of Science (MSc) Student
SLC Tutor and Lab Demonstrator
Department of Statistics
University of Auckland
New Zealand
Homepage: http://www.stat.auckland.ac.nz/~kwan022
Ph: 373-7599 x88475 (City) x88480 (Tamaki)
Paul, David A wrote:

> I am one of only 5 or 6 people in my organization making the
> effort to include R/Splus as an analysis tool in everyday work -
> the rest of my colleagues use SAS exclusively.
>
> Today, one of them made the assertion that he believes the
> numerical algorithms in SAS are superior to those in Splus
> and R -- ie, optimization routines are faster in SAS, the SAS
> Institute has teams of excellent numerical analysts that
> ensure its superiority to anything freely available, PROC
> NLMIXED is more flexible than nlme( ) in the sense that it
> allows a much wider array of error structures than can be used
> in R/Splus, &etc.
>
> I obviously do not subscribe to these views and would like
> to refute them, but I am not a numerical analyst and am still
> a novice at R/Splus. Do there exist refereed papers comparing the
> numerical capabilities of these platforms? If not, are there
> other resources I might look up and pass along to my colleagues?

I suspect it will be difficult to find the answer to your colleagues' assertions without doing your own studies. How important is it to you to settle this disagreement? One could always name the many leading statisticians who contribute to R, but I don't think that name dropping settles anything.

Nonetheless, even if SAS were faster, that would be only part of the issue. As you know, R offers vastly better exploratory graphics, better graphics overall, far more flexible programming, user extensibility, and more natural programming access to the results of previous computations. So even if your colleagues were right in their assertions, they would be overlooking many capabilities of the S language that are not readily available in SAS.

IMO, SAS shines in its ability to read files in almost any format, to handle gigantic data sets without burping, and to produce formatted cross-tabulations and other highly structured text reports.
However, if your colleagues work at all in data exploration, they are ignoring important tools by not exploring R or S-Plus.

--
Michael Prager, Ph.D.
NOAA Center for Coastal Fisheries and Habitat Research
Beaufort, North Carolina 28516
http://shrimp.ccfhrb.noaa.gov/~mprager/
DISCLAIMER: Opinions expressed are personal, not official.
"Paul, David A" <paulda at BATTELLE.ORG> writes:

> I am one of only 5 or 6 people in my organization making the
> effort to include R/Splus as an analysis tool in everyday work -
> the rest of my colleagues use SAS exclusively.
>
> Today, one of them made the assertion that he believes the
> numerical algorithms in SAS are superior to those in Splus
> and R -- ie, optimization routines are faster in SAS, the SAS
> Institute has teams of excellent numerical analysts that
> ensure its superiority to anything freely available, PROC
> NLMIXED is more flexible than nlme( ) in the sense that it
> allows a much wider array of error structures than can be used
> in R/Splus, &etc.
>
> I obviously do not subscribe to these views and would like
> to refute them, but I am not a numerical analyst and am still
> a novice at R/Splus. Do there exist refereed papers comparing the
> numerical capabilities of these platforms? If not, are there
> other resources I might look up and pass along to my colleagues?

Although they are out of date, there are some comparisons of accuracy in

McCullough, B. D. (1998), "Assessing the reliability of statistical software: Part I", The American Statistician, 52, 149-159.

McCullough, B. D. (1999), "Assessing the reliability of statistical software: Part II", The American Statistician, 53, 358-366.

Regarding PROC NLMIXED versus nlme, there are a lot of differences between them. I don't think that PROC NLMIXED will handle nested random effects while nlme does. However, nlme assumes the underlying noise is Gaussian while PROC NLMIXED allows Gaussian or binomial or Poisson. PROC NLMIXED uses adaptive Gaussian quadrature to evaluate the marginal log-likelihood whereas nlme uses a less accurate evaluation but better parameterizations of the variance of the random effects. I think it would be difficult to declare one to be superior to the other.
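[Editorial sketch, not part of the original message.] For readers unfamiliar with the quadrature idea mentioned above, the following is a minimal, non-adaptive Gauss-Hermite approximation of the marginal likelihood for one cluster of binary responses with a Gaussian random intercept. It is plain Python, the names (`normal_expectation`, `cluster_likelihood`) and inputs are hypothetical, and it is not how PROC NLMIXED or nlme is implemented. The adaptive variant additionally recenters and rescales the nodes around the mode of each cluster's integrand; that step is omitted here, and a 3-point rule is far too crude for real use.

```python
import math

# 3-point Gauss-Hermite rule for integrals of the form  int exp(-x^2) f(x) dx
NODES = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
WEIGHTS = [math.sqrt(math.pi) / 6, 2 * math.sqrt(math.pi) / 3, math.sqrt(math.pi) / 6]


def normal_expectation(f, sigma):
    """Approximate E[f(b)] for b ~ N(0, sigma^2) using the substitution b = sqrt(2)*sigma*x."""
    total = sum(w * f(math.sqrt(2.0) * sigma * x) for x, w in zip(NODES, WEIGHTS))
    return total / math.sqrt(math.pi)


def cluster_likelihood(y, eta, sigma):
    """Marginal likelihood of one cluster's 0/1 responses under a random intercept.

    y     : list of 0/1 responses in the cluster
    eta   : fixed-effect linear predictors, one per response
    sigma : standard deviation of the random intercept
    """
    def conditional(b):
        # Likelihood of the cluster given the random intercept b (logistic link).
        lik = 1.0
        for yi, ei in zip(y, eta):
            p = 1.0 / (1.0 + math.exp(-(ei + b)))
            lik *= p if yi == 1 else 1.0 - p
        return lik

    return normal_expectation(conditional, sigma)


# Hypothetical cluster: three responses with made-up linear predictors.
example = cluster_likelihood([1, 1, 0], [0.5, -0.2, 0.1], 1.0)
```

The full log-likelihood would sum the logarithm of this quantity over clusters, and an optimizer would then maximize it over the fixed effects and sigma.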
Douglas Bates <bates at cs.wisc.edu> writes:

> McCullough, B. D. (1998), "Assessing the reliability of statistical
> software: Part I", The American Statistician, 52, 149-159.
>
> McCullough, B. D. (1999), "Assessing the reliability of statistical
> software: Part II", The American Statistician, 53, 358-366.

In my cutting-and-pasting I got those page numbers backwards. The 1998 article is on pages 358-366 and the 1999 one is on pages 149-159.
On Thu, 4 Sep 2003, Paul, David A wrote:

> I am one of only 5 or 6 people in my organization making the
> effort to include R/Splus as an analysis tool in everyday work -
> the rest of my colleagues use SAS exclusively.
>
> Today, one of them made the assertion that he believes the
> numerical algorithms in SAS are superior to those in Splus
> and R -- ie, optimization routines are faster in SAS, the SAS
> Institute has teams of excellent numerical analysts that
> ensure its superiority to anything freely available, PROC
> NLMIXED is more flexible than nlme( ) in the sense that it
> allows a much wider array of error structures than can be used
> in R/Splus, &etc.

While I don't subscribe to the general theory, they have a point about PROC NLMIXED. It does more accurate calculations for generalised linear mixed models than are currently available in R/S-PLUS, and for logistic random effects models the difference can sometimes be large enough to matter.

-thomas
On Fri, 5 Sep 2003, Brian D. Ripley wrote:

> In general I find such discussions irrelevant.
> I bet those users make far, far more errors than any
> of these packages do so.

However, without having the discussions with my colleagues, nothing will ever change. The perception of SAS' "bestness" flows, in my experience, from several things:

a. It was developed long before Splus and R, so more people are familiar with it, especially managers and other decision-makers.

b. The FDA requires SAS transport version 5 datasets, and it is somewhat easier to use SAS throughout a clinical trial than to perform analyses in one package and convert data to another at the end.

c. Because SAS costs so much $$, it _must_ be good (dumb, but people do think that).

d. Because SAS is commercial software, a posteriori errors found in clinical trials analyses (and due to software issues) can be attributed by the NDA applicants to the SAS Institute. Lawyers really like this. Of course, Splus is also commercial and therefore does not suffer from criticism on these grounds.

It is a fact of life that building a better mousetrap does not guarantee that the "world will beat a path to your door". Marketing and perception are very important!

Part of my job involves defending choice of software, and since I'm swimming upstream by choosing to learn R, I need to have intelligent arguments to use when this choice is questioned. Given the responses to my original post, I now do have those arguments in hand. This merely confirms what is already obvious: this is an amazing listserv!

Respectfully,

david paul