If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
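For readers who have not opened the AHRQ programs, the step being discussed (a vector product that turns logistic regression coefficients into predicted probabilities) might look roughly like this in R. This is only a sketch: the coefficient values and the covariate names `age` and `female` are invented for illustration and are not taken from PSSASP3.SAS.

```r
# Hypothetical coefficients from some fitted logistic model (made-up values).
beta <- c("(Intercept)" = -2.0, age = 0.03, female = -0.4)

# Hypothetical patient records with the same covariates.
patients <- data.frame(age = c(50, 70), female = c(0, 1))

# Design matrix: a column of ones for the intercept plus the covariates.
X <- cbind("(Intercept)" = 1, as.matrix(patients[, c("age", "female")]))

# The vector product gives the linear predictor; plogis() is the inverse logit.
linear_predictor <- drop(X %*% beta)
predicted_risk <- plogis(linear_predictor)
print(predicted_risk)
```

With a model object from `glm()`, `predict(fit, newdata, type = "response")` would do the same job without building the matrix by hand.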
Frank,

I couldn't locate the program you mentioned. Do you mind being more specific? Could you please point me to the file? I am just curious. Thanks.

On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
> If anyone wants to see a prime example of how inefficient it is to program
> in SAS, take a look at the SAS programs provided by the US Agency for
> Healthcare Research and Quality for risk adjusting and reporting for
> hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm .
> The PSSASP3.SAS program is a prime example. Look at how you do a vector
> product in the SAS macro language to evaluate predictions from a logistic
> regression model. I estimate that using R would easily cut the programming
> time of this set of programs by a factor of 4.
>
> Frank
> --
> Frank E Harrell Jr   Professor and Chair   School of Medicine
>                      Department of Biostatistics   Vanderbilt University
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
WenSui Liu
Acquisition Risk, Chase
Blog: statcompute.spaces.live.com
"I can calculate the motion of heavenly bodies, but not the madness of people." -- Isaac Newton
Plenty of examples ripe for sending to www.thedailywtf.com there. Like this:

    IF &N. = 1 THEN SUB_N = 1;
    IF &N. = 3 THEN SUB_N = 2;
    IF &N. = 4 THEN SUB_N = 3;
    IF &N. = 6 THEN SUB_N = 4;
    IF &N. = 7 THEN SUB_N = 5;
    IF &N. = 8 THEN SUB_N = 6;
    IF &N. = 9 THEN SUB_N = 7;
    IF &N. = 10 THEN SUB_N = 8;
    IF &N. = 11 THEN SUB_N = 9;
    IF &N. = 12 THEN SUB_N = 10;
    IF &N. = 13 THEN SUB_N = 11;
    IF &N. = 14 THEN SUB_N = 12;
    IF &N. = 15 THEN SUB_N = 13;
    IF &N. = 17 THEN SUB_N = 14;
    IF &N. = 18 THEN SUB_N = 15;
    IF &N. = 19 THEN SUB_N = 16;

Of course it's possible to write code like that in any language; it just looks worse when it's in ALL CAPS and written in a style that looks like the 1980s and onward never happened.

The question is whether it's possible to write this better in SAS. Most of us on this list could write it in R in a better way.

Barry
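One way the recode above might be written in R is as a lookup on the position of N in the list of values the IF chain tests. This is a sketch only: the names `n_values` and `sub_n` are made up here, and the values are copied from the sixteen IF statements quoted above.

```r
# The values of N tested by the IF chain, in order.
# SUB_N is simply the position of N in this vector.
n_values <- c(1, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19)

# match() returns the position of each n in n_values, or NA when n is not
# one of the listed values (the IF chain assigns nothing in that case).
sub_n <- function(n) match(n, n_values)

sub_n(c(1, 6, 19, 2))  # 1 4 16 NA
```

Keeping the values in a single vector means next year's change is an edit to one line, although whether that is clearer than sixteen explicit statements is a matter of taste.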
Thanks for pointing me to the SAS code, Dr Harrell.

After reading the code, I have to say that the inefficiency is related not to the SAS language itself but to the SAS programmer. An experienced SAS programmer wouldn't rely so much on hard-coding, which is very ad hoc and difficult to maintain.

I agree with you that the SAS code goes to a little too much trouble to evaluate predictions. Such a complex data step can actually be replaced by simpler IML code.

WenSui Liu
How could this agency be convinced to adopt R code as well? How would these things work?

Regards,
Ajay
www.decisionstats.com
Frank,

I can't see the code you mention - Web marshall at work - but I don't think you should be too quick to run down SAS. It's a powerful and flexible language, but unfortunately very expensive.

Your example mentions doing a vector product in the macro language. This only suggests to me that the people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices; macros are only an efficient code-copying device. Doing matrix computations in this way is pretty mad, and the code would be impossible, never mind the memory problems. SAS recognise that, but a lot of SAS users remain unfamiliar with IML.

In IML, by contrast, there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS, for example, is one line:

    b = solve(X`*X, X`*y);
    rss = sqrt(ssq(y - X*b));

And to give you a flavour of IML's capabilities, I implemented a SAS version of the MARS program in it about 6 or 7 years ago.

BTW, SPSS also has a matrix language.

Gerard
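For comparison with the IML snippet, the same normal-equations calculation can be sketched in R. The data below are invented purely for illustration, and `root_ssq` is a name chosen here for the square root of the residual sum of squares that the IML line computes.

```r
# Made-up data: an intercept column plus one predictor.
X <- cbind(1, c(1, 2, 3, 4, 5))
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Solve the normal equations (X'X) b = X'y, as solve() does in the IML line.
b <- solve(crossprod(X), crossprod(X, y))

# Square root of the sum of squared residuals.
root_ssq <- sqrt(sum((y - X %*% b)^2))

print(drop(b))
print(root_ssq)
```

In day-to-day R work `lm(y ~ x)` would normally be used instead, since it relies on a QR decomposition rather than forming X'X explicitly.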
I would like to know whether we can create a package in which R functions are renamed to be closer to the SAS language. Doing so would help people familiar with SAS to take to R straight away for their work, lowering the threshold for acceptance, and they could get into a deeper understanding later. Since it would be a package, it would be optional, only for people wanting to try out R coming from SAS.

Do we have such a package right now? It would basically mask R functions behind the name of the equivalent function in another language, just for user ease and for beginners. For example:

    # function for means: summarise the selected columns of a data frame
    procmeans <- function(data, vars) {
      summary(data[, vars, drop = FALSE])
    }

    # function for importing csv: x is the CSV text, y the row-name column
    procimport <- function(x, y) {
      read.csv(textConnection(x), row.names = y, na.strings = " ")
    }

    # function for describing data
    procunivariate <- function(x) {
      summary(x)
    }

Regards,
Ajay
www.decisionstats.com
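To make the proposal concrete, here is a minimal sketch of how such wrappers might be called. Only the names `procmeans`, `procimport` and `procunivariate` come from the post; the argument names, the sample CSV text and the use of `read.csv(text = ...)` are assumptions made for this example, and no such package is implied to exist.

```r
# Hypothetical SAS-flavoured aliases for base R functions.
procmeans <- function(data, vars) summary(data[, vars, drop = FALSE])
procimport <- function(text, id_col) {
  read.csv(text = text, row.names = id_col, na.strings = " ")
}
procunivariate <- function(x) summary(x)

# Made-up CSV content standing in for an imported file.
csv_text <- "id,height,weight\na,150,52\nb,160,61\nc,170,68"

people <- procimport(csv_text, "id")
print(procmeans(people, c("height", "weight")))
print(procunivariate(people$height))
```

Each wrapper is a one-line alias, which illustrates both the appeal (familiar names) and the limit of the idea: the arguments and the returned objects are still R's, not SAS's.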
I've actually used AHRQ's software to create Inpatient Quality Indicator reports. I can confirm pretty much what we already know; it is inefficient. Running on about 1.8 - 2 million cases, it would take just about a whole day to run the entire process from start to finish. That isn't all processing time and includes some time for the analyst to check results between substeps, but I still knew that my day was full when I was working on IQI reports.

To be fair though, there are a lot of other factors (besides efficiency considerations) that go into AHRQ's program design. First, there are a lot of changes to that software every year. In some cases it is easier and less error-prone to hardcode a few points in the data so that it is blatantly obvious what to change next year should another analyst need to do so. Second, the organizations that use this software often require transparency and may not have high-level programmers on staff. Writing code so that it is accessible, editable, and interpretable by intermediate-level programmers or analysts is a plus. Third, given that IQI reports are often produced on a yearly basis, there's no real need to sacrifice clarity, etc. for efficiency - you're only doing this process once a year.

There are other points that could be made, but the main idea is that I don't think it's fair to hold this software up, out of context, as an example of SAS's (or even AHRQ's) inefficiencies. I agree that SAS syntax is nowhere near as elegant or as powerful as R from a programming standpoint; that's why, after 7 years of using SAS, I switched to R. But comparing the two at that level is like racing a Ferrari and a Bentley to see which is the better car.
Three comments:

1. I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language.

2. I found almost all of the "improvements" to the multi-line SAS recode to be regressions, both the SAS and the S suggestions.
   a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better.
   b. If I were writing the S code for such an application, it would look much the same. I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code.

3. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and .... with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was "interesting" to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth.

S was once headed down the same road. One of the best things ever to happen to the language was documented in the blue book "The New S Language", where Becker et al had the wisdom to scrap the macro processor.

Terry Therneau
Dear Ajay,

just to deny the implicit statement 'corporate user' = 'moron' surfacing here and there in this interesting thread :^). This might be a statistical regularity, but it should by no means be considered a theorem, as there are counter-examples available. Even in corporations and burosaurs of any kind, you can find people willing to learn both languages, appreciate the difference between them and use each where it is particularly strong.

IMVHO, acceptance of R in the corporate world has little to do with syntax and much to do with legacies, (discharge of) responsibilities and the distance between the decision maker/buyer and those who are actually working with the SW. Otherwise, assuming that 'corporate users' are not at a significant cerebral disadvantage (which I like to assume), the penetration of R in education and in small and large companies should be the same, which I'm afraid it is not. So I believe it boils down to industrial organization and the open source vs. commercial development model, rather than to some kind of (more or less appropriate) function rebranding.

It is the *difference* in syntax w.r.t. SAS that prompted the shift to R, in my case at least. It was its ease and 'cleanliness' of installation (no registry entries, no access to forbidden directories required) that allowed me to experiment with it without having to mess with the IT Dept. (which would probably have put an end to my quest). It was its open source nature that allowed me to install it anywhere I liked.

My 2 Euro-Cents
Giovanni

Disclaimer: just thinking of the Proc Step gives me shivers; yet I recognize SAS is fast and powerful. I could understand somebody wanting to execute SAS through R syntax, but the opposite is beyond my grasp.
------------------------------
Message: 72
Date: Wed, 4 Mar 2009 08:44:51 +1300
From: Rolf Turner <r.turner at auckland.ac.nz>
Subject: Re: [R] Inefficiency of SAS Programming
To: Ajay ohri <ohri2007 at gmail.com>
Cc: "r-help-bounces at r-project.org" <r-help-bounces at r-project.org>, "Gerard M. Keogh" <GMKeogh at justice.ie>, R list <r-help at stat.math.ethz.ch>, Greg Snow <Greg.Snow at imail.org>

On 3/03/2009, at 5:58 PM, Ajay ohri wrote:
> For an "inefficient" language, it sure has dominated the predictive
> analytics world for 3 plus decades. I referred once to intellectual
> jealousy between Newton and Leibniz.
>
> I am going ahead and creating the R package called "Anne".
>
> It basically is meant only for SAS users who want to learn R, without
> upsetting the schedule of the corporate users.
>
> Simply put, it is a wrapper on SAS language using the function
> command, i.e. the procunivariate function in the "Anne" package would
> call the summary function, and so on...

Reminds me of fortune(38).

cheers,

Rolf Turner