Hi all,

Can someone clarify this for me: feature-wise, which is better, R or SAS, leaving aside the commercial aspect? I suppose there are a few people who have worked with both R and SAS, and I hope they can help me decide.

Thank you for the help.
Hello,

I think you'll get some bias if you survey an R mailing list on whether SAS or R is the better software.

Personally, what I like in R is:
- the fact that you have to know what you are doing in order to do it,
- the graphics potential, which is outstanding (limited only by the imagination of the user; just check the graph on the R web page, http://www.r-project.org/, and try to do the same with SAS, good luck!),
- the mailing list, which is great,
- and many, many other things.

What I like in SAS:
- nothing

So in my opinion, whether you leave aside or consider the commercial aspect (and that counts too), there is no competition.

Something else: go to http://www.sas.com/ and search for the word "statistic". It is missing. Isn't that curious for statistical software?

--
Romain FRANCOIS : francoisromain at free.fr
Third-year student, Institut de Statistique de l'Université de Paris (ISUP), Industry and Services track
http://www.isup.cicrp.jussieu.fr/
Hi,

On Sat, 20 Nov 2004, neela v wrote:
> [...] features wise which is better R or SAS, leaving the commercial aspect associated with it. [...]

R is THE one to use when it comes to graphics. So far, I haven't seen any other software that can produce better graphs (and I've used SAS, Minitab, Excel, Genstat, etc.). Its programming features, IMHO, are also neater than SAS's procedure-oriented programming.

Cheers,
Kevin

--
Ko-Kang Kevin Wang
PhD Student
Centre for Mathematics and its Applications
Mathematical Sciences Institute (MSI)
Australian National University
Canberra, ACT 0200, Australia
Homepage: http://wwwmaths.anu.edu.au/~wangk/
There was at least one previous discussion of SAS vs. R on this list. I searched my archives (below). The thread begins at http://finzi.psych.upenn.edu/R/Rhelp01/archive/5947.html but then changes title.

Two arguments for SAS:

1. You may be working with other people who use it. (That is why many of my colleagues use it, aside from inertia and [rational, since they have no choice] ignorance of alternatives.)

2. It used to be true, and it may still be true, that SAS compiles all code before running it, whereas R uses many compiled routines but does not compile the code you write yourself. Thus, SAS may be faster for huge data sets, like census data.

Jon

--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
R search page: http://finzi.psych.upenn.edu/
I'm definitely biased towards R, but I'll try to play devil's advocate and point out some advantages of SAS.

- There's a huge collection of data-manipulation features. You can parse all sorts of weird files using DATA/INPUT statements. You can reshape, merge, combine, summarize, define and propagate missing observations, (...) using the numerous features of the 'SAS DATA Step Language', as the manuals call it. Sure, there's much of it you can do in plain R, but R comes from the Unix tradition of specific tools for specific tasks, so it doesn't try to be everything to everyone. In order to match SAS's Data Step Language using R, you would have to use a language better suited to such tasks, like Perl/awk for parsing and a SQL database for storing and reshaping. These languages integrate nicely with R, and if you can really exploit their potential, R + Perl/awk + SQL will definitely surpass SAS's data-manipulation features.

- There are some statistical analyses which are more completely implemented in SAS. PROC VARCOMP comes to mind (I know you can use lme for mixed models, but you have to use a cumbersome syntax when your random effects are crossed rather than nested). Sure, there's nothing stopping you from using R and adapting it to your needs or implementing the analysis you need. While at first it will take you more time to actually think about the problem and how to implement it, the flexibility you'll gain will probably pay off.

Summarizing, I think the main difference between SAS and R is their philosophy and how it is reflected in their implementations. While SAS aims to "(...) provide data analysts one system to meet all their computing needs"*, R tries to be the best tool for the specific task of statistical analysis.

SAS's approach allows you to learn one single language and use it for almost all the computing needs you'll have. While it seems appealing at first sight, after you reach a certain level of proficiency the lack of flexibility of this approach starts to limit what you can actually do. You'll be locked inside what you can do using the PROC/DATA procedures provided by SAS; if you want to implement new analyses, you will have a hard time.

R's approach, on the other hand, may seem harder at first, as you'll have to learn (if you don't know some of them already, that is) specific languages for certain tasks, such as** LaTeX or HTML for reports, SQL databases, Perl/awk for data parsing, C/C++/Fortran for implementing high-performance functions, etc. You'll be highly compensated, though, by the gains in flexibility and productivity. Not to mention that the more proficiency you have in R and the other tools of your choice, the more flexibility and power you'll have at hand to implement new analyses, manipulate data, create reports dynamically and so on.

It boils down to how much time and effort you are willing to spend and for how long you're going to be using the languages. If you prefer to spend a great amount of money at first, and less time, by learning only one language, and don't mind being limited in the future in the range of things you'll be able to do, SAS is the appropriate choice. If you can spare the time to learn R and a set of appropriate tools for each task you'll want to do, it'll take you more time at first, but in the end you'll have much more power and flexibility at hand.

* SAS User's Guide: Basics.
** Strictly speaking you don't have to learn any of those. You can get along well using plain R in the beginning, but in order to exploit the power of its approach, you'll find yourself needing one or more of them.
-- Fernando Henrique Ferraz P. da Rosa http://www.ime.usp.br/~feferraz
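To make the remark that "there's much of it you can do in plain R" concrete, here is a minimal sketch of a merge, a grouped summary, and a long-to-wide reshape using base R only. The data frames `patients` and `visits`, their column names, and all values are invented for illustration; nothing here comes from the original post, and it says nothing about how the same steps would perform on large files.

```r
# Hypothetical example data; names and values are invented for illustration.
patients <- data.frame(id = 1:3, group = c("a", "b", "a"))
visits <- data.frame(id    = c(1, 1, 2, 2, 3, 3),
                     visit = c(1, 2, 1, 2, 1, 2),
                     score = c(10, 12, NA, 8, 7, 9))

# Merge on a key (roughly what a DATA step MERGE ... BY id does)
merged <- merge(patients, visits, by = "id")

# Summarise: mean score per group; the formula interface drops the NA row
group_means <- aggregate(score ~ group, data = merged, FUN = mean)

# Reshape long -> wide: one row per patient, one score column per visit
wide <- reshape(visits, idvar = "id", timevar = "visit", direction = "wide")

print(merged)
print(group_means)
print(wide)
```

Note that the missing score is dropped silently by `aggregate()`; how missing values should propagate is exactly the kind of decision the post attributes to the data-handling layer, so it is worth checking explicitly in real work.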
I estimate that SAS is about 11 years behind R in statistical analysis and graphics capabilities, and that gap is growing. Also, code in SAS is not peer-reviewed as is code in R. SAS has advantages in two areas: dealing with massive datasets (generally speaking, > 1GB) and getting more accurate P-values in mixed effects models.

See http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf for one comparison of SAS and S on technical features.

There are companies whose yearly license fees to SAS total millions of dollars. Then those companies hire armies of SAS programmers to program an archaic macro language using old statistical methods to produce ugly tables and the worst graphics in the statistical software world.

Frank Harrell
SAS User, 1969-1991

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
My personal proclivities incline me to use SAS (or Minitab) for analysing mixed effects models. I have never been able to get my head around the S/R syntax for such models, whereas the SAS/Minitab syntax seems perfectly clear to me.

For ***ANYTHING*** else I would use R.

(In my work I don't encounter the multi-multi-megabyte data sets for which SAS apparently has a pronounced advantage with respect to speed.)

cheers,

Rolf Turner
rolf at math.unb.ca
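For readers who have not seen the S/R mixed-model syntax referred to here, a minimal random-intercept sketch follows. It assumes the nlme package (a recommended package that normally ships with R) and its bundled Orthodont data; the particular model is chosen only for illustration and is not taken from the thread.

```r
library(nlme)  # assumed to be installed; it is one of R's recommended packages

# Fixed effect of age on distance, with a random intercept for each Subject.
# The part after "|" names the grouping factor; "~ 1" means "intercept only".
# A roughly corresponding SAS specification would be PROC MIXED with
#   model distance = age;  random intercept / subject=Subject;
fit <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

print(fixef(fit))    # estimated fixed effects
print(VarCorr(fit))  # estimated variance components
```

The nested case reads fairly naturally in this notation; crossed random effects are where the `lme` specification becomes awkward, which matches the complaint made earlier in the thread.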
Two major advantages of SAS that seem to have been overlooked in the previous replies are:

1) The data-step language in SAS is in general more human-readable for data manipulation than R code. R is not a definite write-only language like APL, but in data manipulation in particular it is easy to write code that is impossible to decipher after a few weeks. You can also produce unreadable code in SAS, but it generally takes more of an effort. Thus: data manipulation is easier to document in SAS than in R.

2) proc tabulate. This procedure enables you to do extensive, sensible tabulation of your data if you are prepared to read the manual. (This is not a product of the complexity of the software, but of the complexity of the tabulation features.) Compared to this, only rudimentary tools exist in R (afaik).

So if you want to do well-documented data manipulation and clear and compact tables, go for SAS. If you want to do statistical analyses and graphics (in finite time), go for R.

Bendix Carstensen

----------------------
Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2
DK-2820 Gentofte
Denmark
bxc at steno.dk
www.biostat.ku.dk/~bxc
----------------------
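For comparison, the base R tabulation tools alluded to here look roughly like the sketch below. It uses the `mtcars` data set that ships with R purely as a stand-in; it is not meant to suggest that these functions match the layout control of proc tabulate.

```r
# Two-way table of counts: cylinders by gears
counts <- xtabs(~ cyl + gear, data = mtcars)

# Three-way table, flattened into a single printable layout
flat <- ftable(xtabs(~ cyl + gear + am, data = mtcars))

# Cell means instead of counts: mean mpg by cylinders and gears
# (combinations with no cars come out as NA)
mean_mpg <- tapply(mtcars$mpg,
                   list(cyl = mtcars$cyl, gear = mtcars$gear),
                   mean)

# Row and column totals added to the count table
with_totals <- addmargins(counts)

print(flat)
print(round(mean_mpg, 1))
print(with_totals)
```

Each statistic needs its own call and the pieces have to be combined by hand, which is probably what "rudimentary" refers to.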
I can't resist dipping my oar in here.

For me, some significant advantages of SAS are:

- Ability to input data in almost *any* conceivable form using the combination of features available through input/infile statements, SAS informats and formats, data step programming, etc. Dataset manipulation (merge, join, stack, subset, summarize, etc.) also favors SAS in more complex cases, IMHO.

- Output delivery system (ODS): *Every* piece of SAS output is an output object that can be captured as a dataset, or rendered in RTF, LaTeX, HTML, PDF, etc. with a relatively simple mechanism (including graphs):

      ods pdf file='mystuff.pdf';
      << any sas stuff >>
      ods pdf close;

- If you think the output tables are ugly, it is not difficult to define a template for *any* output to display it the way you like.

- ODS Graphics (new in SAS 9.1) will extend much of this so that statistical procedures will produce many graphics themselves with ease.

One significant disadvantage for me is the difficulty of composing multipanel graphic displays (trellis graphics, linked micromaps, etc.) due to the lack of an overall, top-down graphics environment. As well, there are a variety of kinds of graphics I've found extraordinarily frustrating to try to do in SAS because of lack of coherence or generality in the output available from procedures; an example would be effect displays, such as those implemented in R in the effects package.

I can't agree, however, with Frank Harrell that SAS produces 'the worst graphics in the statistical software world.' One can get ugly graphs in R almost as easily as in SAS just by accepting the 80-20 rule: you can typically get 80% of what you want with 20% of the effort. To get what you really want takes the remaining 80% of the effort.

On the other hand, the active hard work of many R developers has led to R graphics for which the *default* results for many graphs avoid many of the egregious design errors introduced in SAS in the days of pen plotters (+ signs for points, cross-hatching for area fill).

-Michael

--
Michael Friendly
Professor, Psychology Dept.
York University
4700 Keele Street
Toronto, ONT M3J 1P3 CANADA
Email: friendly at yorku.ca
http://www.math.yorku.ca/SCS/friendly.html
Hi Neela,

Just my 2¢ regarding the R vs. SAS issue.

I started using SAS during my undergraduate studies in 1993, and I am now migrating to R; I hope never to go back to SAS. My comments are:

- SAS is huge: it takes over 1 GB of space on my PC while R takes just over 100 MB. These days that may not be an important issue, but I'd consider it in your analysis.

- One of the things that bothered me the most about SAS was its "system update" protocol. It was a mess (at least the last time I tried downloading the updates, about 2 years ago). R has a great community and I love the update.packages() function.

- Regarding the large-datasets discussion, I've had some experience working with mixed models on large datasets, and both programs end up taking a lot of time or crashing.

- The learning curve for a non-stats, non-programmer guy was steeper for SAS, but then again SAS was my first. However, R still does not come so naturally to me when I am coding; I have to look at my old code, manuals and Google.

If I ever end up teaching applied stats, I'll teach using R.

Sincerely,

Jose
... migrating to R ... and one of these days to Linux

--
Jose A. Hernandez
Ph.D. Candidate
Precision Agriculture Center
Department of Soil, Water, and Climate
University of Minnesota
1991 Upper Buford Circle
St. Paul, MN 55108
I very much doubt you can make an informed decision if you leave the commercial aspect (license) aside.

A single Base SAS installation (server) can cost tens of thousands of [[your currency here; may need to multiply by 10 or 100 or more]] in the first year, then a percentage of that in the following years. (SAS software is not purchased, but licensed on a yearly basis.)

Want more than Base SAS? Prepare your wallet: thousands upon thousands (per year) for regression, anova, clustering (SAS/Stat), graphics (SAS/Graph), time series (SAS/ETS), optimization (SAS/OR), etc. Then, if you want decision trees and neural networks (Enterprise Miner), I warmly recommend that you quickly find a chair and sit down before you hear the price tag.

Will you always work for an organization that licenses SAS software? Will the organization license all the modules you'll need? Will those modules do everything you want?

As others have said, R is a lot more flexible, and the GPL ensures that whatever you can do today will continue to be expanded and improved (much faster than SAS Institute would want or be able to expand or improve SAS).

All in all, if you're primarily interested in data analysis (and don't want, for example, to get a job as a SAS programmer) and still choose SAS, you will regret it one day. The benefits are few (such as robust manipulation of massive data sets; I mean in excess of hundreds of millions of rows) and the risks are high (whatever you do is dependent on proprietary, very expensive software). With R, almost the opposite is true: lots of benefits and no risks (nothing can take R away from you).

HTH,
b.
> - Output delivery system (ODS): *Every* piece of SAS output is an
>   output object that can be captured as a dataset, rendered in RTF,
>   LaTeX, HTML, PDF, etc. with a relatively simple mechanism
>   (including graphs)
>       ods pdf file='mystuff.pdf';
>       << any sas stuff >>
>       ods pdf close;

R now has this ability as well via the "sinkplot" and "textplot" commands provided by the "gplots" package.

-G
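A rough base-R analogue of the `ods pdf ... ods pdf close` bracket is sketched below, without gplots: text output is captured with `capture.output()` and drawn onto a page of the same PDF that holds the graph. The output path, the model, and the page layout are all arbitrary choices made for illustration; this is a sketch of the idea, not a substitute for the gplots functions mentioned above.

```r
# Hypothetical output location; a temporary directory keeps the sketch harmless.
out_file <- file.path(tempdir(), "mystuff.pdf")

fit <- lm(mpg ~ wt, data = mtcars)                      # any analysis would do
printed <- capture.output(print(coef(summary(fit))))    # text output as lines

pdf(out_file, width = 8.5, height = 11)                 # open the PDF device

# Page 1: a graph
plot(mpg ~ wt, data = mtcars, main = "mpg vs. weight")
abline(fit)

# Page 2: the captured text, drawn top-left in a monospaced font
plot.new()
text(0, 1, paste(printed, collapse = "\n"),
     adj = c(0, 1), family = "mono", cex = 0.8)

dev.off()                                               # close the device
```

Unlike ODS, nothing here is captured automatically; each piece of output has to be routed to the device by hand.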
I apologize for adding this so late to the "SAS or R software" thread. This is a question, not a reply, but it seems to me to fit in well with the subject of this thread.

I would like to hear about anyone's experiences in the two areas below. I should add that I have no experience myself in these areas:

1) Migrating from SAS to R in the choice of statistical software used for FDA reporting. (For example, was there more effort involved in areas of documentation, revision tracking, or validation of software code?)

2) Migrating from SAS to R in the choice of statistical software used for NIH reporting (or reporting to other US or non-US government agencies).

I find myself using R more and more and being continually amazed by its breadth of capabilities, though I have not tried ordering pizza yet. I use SAS, S-Plus, and, more recently, R for survival analysis and recurrent events in clinical trials.

Alex Cambon
Biostatistician
School of Public Health and Information Sciences
University of Louisville
One point that is missing in this discussion is ease of review by the statistician at the FDA. As a statistician in clinical trials, you want to make it as easy as possible for your colleague at the FDA to do their job, so you put the programs in a format that they are more likely to find useful. As more reviewing statisticians are familiar with SAS than with other statistical packages/languages, I feel more comfortable sending a submission in SAS. Reviewers do use the programs in a filing that were written for creation of data sets and analysis to check for correct variable definitions and appropriate analyses. If the programs are written in a package/language that the reviewer understands, they can get their job done more easily.

Given this, I have used S-Plus in regulatory work where it was clearly stronger; at some point I hope to use R where it adds a clear benefit.

Of course, 95% of the statistical analyses that are performed for regulatory submission can probably be done just as well with any of the major statistical packages/languages available.

$0.02

--Matt

Matt Austin
Statistician
Amgen
One Amgen Center Drive, M/S 24-2-C
Thousand Oaks CA 93021

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Marc Schwartz
> Sent: Friday, December 17, 2004 16:19 PM
> To: Alexander C Cambon
> Cc: R-Help; Douglas Bates
> Subject: Re: [R] SAS or R software
>
> On Fri, 2004-12-17 at 17:11 -0500, Alexander C Cambon wrote:
> > I apologize for adding this so late to the "SAS or R software"
> > thread. This is a question, not a reply, but it seems to me to fit
> > in well with the subject of this thread.
> >
> > I would like to know anyone's experiences in the following two areas
> > below. I should add I have no experience myself in these areas:
> >
> > 1) Migrating from SAS to R in the choice of statistical software
> > used for FDA reporting.
>
> You will find that to be a non-issue from the FDA's perspective. This
> has been discussed here with some frequency. If you search the
> archives you will find comments from Frank Harrell and others.
>
> The FDA does not and cannot endorse a particular software product. Nor
> does it validate any statistical software for a specific purpose. They
> do need to be able to reproduce the results, which means they need to
> know what software product was used, which version and on what
> platform, etc.
>
> The SAS XPORT Transport Format (which is openly defined and
> documented) has been used for the transfer of data sets and has been
> available in many statistical products.
>
> There have been a variety of activities (CDISC, HL-7, etc.) regarding
> the electronic submission of data to the FDA. Some additional
> information is here:
>
> http://www.fda.gov/cder/regulatory/ersr/default.htm
>
> and here:
>
> http://www.cdisc.org/news/index.html
>
> Any other issues impacting the selection of a particular statistical
> application are more likely to be political within your working
> environment and FUD.
>
> As you are likely aware, other statistically relevant issues are
> contained in various ICH guidance documents regarding GCP
> considerations and principles for clinical trials:
>
> http://www.ich.org/UrlGrpServer.jser?@_ID=475&@_TEMPLATE=272
>
> Keep in mind also that one big advantage R has (in my mind) is the use
> of Sweave for the reproducible generation of reports, which to an
> extent are self-documenting.
>
> > (For example, was there more effort involved in areas of
> > documentation, revision tracking, or validation of software codes?)
>
> Since the FDA's role with computer software and validation has been
> raised before, the following documents cover many of these areas. The
> list is not meant to be exhaustive, but should give a flavor in this
> domain.
>
> There are specific guidance documents by the FDA pertaining to
> software that is contained in a medical device (i.e. the firmware in a
> pacemaker or medical monitoring equipment) or is used to develop a
> medical device. The current guidance in this case is here:
>
> http://www.fda.gov/cdrh/comp/guidance/938.html
>
> Other guidance pertains to 21 CFR 11, which addresses data management
> systems used for clinical trials and covers issues such as electronic
> signatures, audit trails and the like. A guidance document for that is
> here:
>
> http://www.fda.gov/cder/guidance/5667fnl.htm
>
> Keep in mind, from a perspective standpoint, that even MS Excel and
> Access can be made to be 21 CFR 11 compliant and there are companies
> whose business is focused on just that task.
>
> There is also a general guidance document for computer systems used in
> clinical trials here:
>
> http://www.fda.gov/ora/compliance_ref/bimo/ffinalcct.htm
>
> Though it is to be superseded by a draft document here:
>
> http://www.fda.gov/cder/guidance/6032dft.htm
>
> > 2) Migrating from SAS to R in the choice of statistical software
> > used for NIH reporting (or other US or non-US) government agencies).
>
> Same here, to my knowledge.
>
> As I was typing this, I see Frank just responded.
>
> I also just noted Doug's post, so perhaps some of the above
> information will be helpful in clarifying some of his questions as
> well.
>
> I believe that the above is factually correct, but if someone knows
> anything to not be so, please correct me.
>
> HTH,
>
> Marc Schwartz
I've seen multiple comments about MS Excel's precision and accuracy. Can you please point me in the right direction in locating information about these?

Thank you very much,

Shawn Way, PE
Engineering Manager
Tanox, Inc.

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Frank E Harrell Jr
Sent: Saturday, December 18, 2004 7:24 PM
To: Henric Nilsson
Cc: R-Help; Douglas Bates; MSchwartz at medanalytics.com; Alexander C Cambon; Jim Garrett
Subject: Re: [R] SAS or R software

Henric Nilsson wrote:
> Frank E Harrell Jr said the following on 2004-12-18 15:03:
>
>> That is not clear.
>
> Perhaps. And I think this is the issue. From the clients' perspective,
> not a single FDA document states that you can use other software than
> SAS. They haven't really thought about the fact that there aren't any
> FDA documents encouraging the use of SAS for statistical analyses.

Right. This reminds me of the worst movie of all time, Plan 9 From Outer Space, in which the psychic Criswell closes the movie by saying "Can you prove that this DIDN'T happen?".

> I don't think that the real problem is convincing regulatory
> authorities that R (or any other (open-source) software for that
> matter) is operating adequately. But clients and auditors seem to
> reason along the lines of "rather being safe than sorry" and "nobody's
> ever been criticized for using SAS". From their perspective, when we
> propose using `some other' software they start thinking that it
> perhaps may jeopardize their trial results (and, all too often, "but
> doesn't FDA require SAS?").

Yes, that is the hurdle.

> How to fight this? I don't know. Right now I'm thinking, "If you can't
> beat 'em, join 'em" and that the way of proving that `some other'
> software works is through having similar documents and tools as the
> commercial vendors.

With the job market for statisticians being excellent, I've often wondered why clinical statisticians in industry are so often timid. Statisticians need to show strength and stamina, along with good teaching skills, on this issue.

>> And since FDA allows submissions using Excel, with not even an audit
>> trail, and with known major statistical computing errors in Excel, I
>> am fairly certain that it is not applicable or at the least is not
>> enforced in any meaningful way.
>
> The general preconception seems to be that neither SAS nor Excel needs
> validation. E.g. the British guideline referenced in my previous email
> states on p. 12 that
> "It is generally considered that there is no requirement for
> validation of commercial hardware and established operating systems or
> for packages such as the SAS system, Oracle and MS Excel, as entities
> in their own right. However, most are configurable systems and so need
> adequate control of installation and their configuration parameters."

This makes me wonder about the British system. Have they not seen the serious calculation errors documented to be in Excel?

> Luckily for Excel, not a single word about precision and adequacy...

Right. Thanks for your note, Henric.

-Frank

> Henric

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
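As background to the question about precision, the sketch below shows one general class of numerical problem that discussions of statistical accuracy in software tend to raise: the one-pass "calculator" formula for the variance loses almost all significant digits when the data have a large mean and a small spread. The thread itself does not say which errors in Excel are meant or which formulas any Excel version uses, so this is only an illustration of the kind of issue, run in R with made-up numbers.

```r
# Three values with a large mean and a tiny spread; the true sample variance is 1.
x <- c(1e8 + 1, 1e8 + 2, 1e8 + 3)
n <- length(x)

# One-pass "calculator" formula: subtracts two nearly equal, very large numbers,
# so rounding error in the squares dominates the result.
naive <- (sum(x^2) - sum(x)^2 / n) / (n - 1)

# Two-pass approach: centre the data first, then sum squared deviations.
# This is the mathematically identical but numerically safer form.
two_pass <- sum((x - mean(x))^2) / (n - 1)

cat("one-pass formula:", naive, "\n")
cat("two-pass formula:", two_pass, "\n")
cat("built-in var():  ", var(x), "\n")
```

Checking any package, R included, against small data sets like this with a known answer is a simple way to probe its accuracy.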