Dear all,

There have been a variety of discussions on the R list regarding the use of R in clinical trials. The following post from the STATA list provides an interesting opinion regarding why SAS remains so popular in this arena:

http://www.stata.com/statalist/archive/2008-01/msg00098.html

Regards,

-Cody Hamilton
Cody,

How amazing that SAS is still used to produce reports that reviewers hate and that requires tedious low-level programming. R + LaTeX has it all over that approach IMHO. We have used that combination very successfully for several data and safety monitoring reporting tasks for clinical trials for the pharmaceutical industry.

Frank

Cody Hamilton wrote:
> Dear all,
>
> There have been a variety of discussions on the R list regarding the use of R in clinical trials. The following post from the STATA list provides an interesting opinion regarding why SAS remains so popular in this arena:
> http://www.stata.com/statalist/archive/2008-01/msg00098.html
>
> Regards,
>
> -Cody Hamilton

--
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University
Agreed. The tables in the PDF that the poster at http://www.stata.com/statalist/archive/2008-01/msg00098.html links to look terrible compared to the standard I am used to from Hmisc::latex(). Just saying.

-Ista

On Wed, Feb 17, 2010 at 9:33 PM, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
> Cody,
>
> How amazing that SAS is still used to produce reports that reviewers hate
> and that requires tedious low-level programming. R + LaTeX has it all over
> that approach IMHO. We have used that combination very successfully for
> several data and safety monitoring reporting tasks for clinical trials for
> the pharmaceutical industry.
>
> Frank
>
>
> Cody Hamilton wrote:
>>
>> Dear all,
>>
>> There have been a variety of discussions on the R list regarding the use
>> of R in clinical trials. The following post from the STATA list provides an
>> interesting opinion regarding why SAS remains so popular in this arena:
>> http://www.stata.com/statalist/archive/2008-01/msg00098.html
>>
>> Regards,
>>
>> -Cody Hamilton
>
> --
> Frank E Harrell Jr   Professor and Chairman        School of Medicine
>                      Department of Biostatistics   Vanderbilt University
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
Frank E Harrell Jr wrote:
> Cody,
>
> How amazing that SAS is still used to produce reports that reviewers
> hate and that requires tedious low-level programming. R + LaTeX has
> it all over that approach IMHO. We have used that combination very
> successfully for several data and safety monitoring reporting tasks
> for clinical trials for the pharmaceutical industry.
>
> Frank

I used to work for a research group that also used R + LaTeX to produce DSMB reports for clinical trials. If the DSMB members had only been exposed to SAS reports before, you could not get them to stop praising the quality of the R + LaTeX reports, even years into a trial.

Erik
On 2/18/10, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
> How amazing that SAS is still used to produce reports that reviewers hate
> and that requires tedious low-level programming. R + LaTeX has it all over
>
To simplify things, R + LyX could also be a solution.

Liviu
It is easy to devolve into visceral response mode, lose objectivity and slip into intolerance. R, S, S-Plus, SAS, PASW (nee SPSS) and STATA are all tools. Each has strengths and weaknesses. No one of them is inherently better or worse than the others. The quality of the results produced by any one of them is a function of the abilities of the person who manipulates them. Don't expect quality work from any program unless the person running the program knows what he or she is doing!

John

John Sorkin
JSorkin at grecc.umaryland.edu

-----Original Message-----
From: Liviu Andronic <landronimirc at gmail.com>
Cc: <r-help at r-project.org>
To: Frank E Harrell Jr <f.harrell at vanderbilt.edu>
Cc: Cody Hamilton <cody.shawn at yahoo.com>
Sent: 2/18/2010 4:29:27 AM
Subject: Re: [R] Use of R in clinical trials

On 2/18/10, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
> How amazing that SAS is still used to produce reports that reviewers hate
> and that requires tedious low-level programming. R + LaTeX has it all over
>
To simplify things, R + LyX could also be a solution.

Liviu
Points regarding the advantages of LaTeX are very well taken. If I were fortunate enough to have complete ownership of the document (as might be the case with a DSMB report produced by the Biostat group), then LaTeX would be a wonderful choice. Though I am not a LaTeX user, I can easily imagine that the productivity gains could be considerable.

Unfortunately, in most cases the Biostatistics group is responsible for providing a relatively small piece of the overall document, which is owned by another group that inevitably uses MS Office.

--- On Wed, 2/17/10, Erik Iverson <eriki at ccbr.umn.edu> wrote:

> From: Erik Iverson <eriki at ccbr.umn.edu>
> Subject: Re: [R] Use of R in clinical trials
> To: "Frank E Harrell Jr" <f.harrell at vanderbilt.edu>
> Cc: "Cody Hamilton" <cody.shawn at yahoo.com>, r-help at r-project.org
> Date: Wednesday, February 17, 2010, 9:05 PM
>
> Frank E Harrell Jr wrote:
> > Cody,
> >
> > How amazing that SAS is still used to produce reports
> > that reviewers hate and that requires tedious low-level
> > programming. R + LaTeX has it all over that approach
> > IMHO. We have used that combination very successfully
> > for several data and safety monitoring reporting tasks for
> > clinical trials for the pharmaceutical industry.
> >
> > Frank
>
> I used to work for a research group that also used R +
> LaTeX to produce DSMB reports for clinical trials. If
> the DSMB members had only been exposed to SAS reports
> before, you could not get them to stop praising the quality
> of the R + LaTeX reports, even years into a trial.
>
> Erik
>
I am old enough to have lived through this particular transition. Prior to the advent of SAS, trials were analyzed by in-house written programs (usually in Fortran, maybe with the help of IMSL). These programs were huge card decks. Having the card reader eat a card halfway through reading the deck was a not unusual occurrence.

I was responsible for deploying the first version of SAS. This meant compiling PL/I code stored on a magnetic tape and storing it on limited and expensive disk drives. It was several years before the transition from using in-house programs to SAS was completed. Yes, there was a great deal of angst, and I spent a lot of time convincing people that in the end there would be a cost advantage, and overcoming institutional inertia.

By the way, this was all done on computers that you will probably find only in a museum, if at all. These systems filled whole rooms and required a staff just to keep them running.

Murray M Cooper, PhD
Richland Statistics
9800 North 24th St
Richland, MI 49083

-----Original Message-----
>From: "Christopher W. Ryan" <cryan at binghamton.edu>
>Sent: Feb 18, 2010 1:08 PM
>To: r-help at r-project.org
>Cc: p.dalgaard at biostat.ku.dk
>Subject: Re: [R] Use of R in clinical trials
>
>Pure Food and Drug Act: 1906
>FDA: 1930s
>founding of SAS: early 1970s
>
>(from the history websites of SAS and FDA)
>
>What did pharmaceutical companies use for data analysis before there was
>SAS? And was there much angst over the change to SAS from whatever was
>in use before?
>
>Or was there not such emphasis on and need for thorough data analysis
>back then?
>
>--Chris
>Christopher W. Ryan, MD
>SUNY Upstate Medical University Clinical Campus at Binghamton
>425 Robinson Street, Binghamton, NY 13904
>cryanatbinghamtondotedu
>
>"If you want to build a ship, don't drum up the men to gather wood,
>divide the work and give orders. Instead, teach them to yearn for the
>vast and endless sea." [Antoine de St. Exupery]
>
>Bert Gunter wrote:
>> DISCLAIMER: This represents my personal view and in no way reflects that of
>> my company.
>>
>> Warning: This is a long harangue that contains no useful information on R.
>> May be wise to delete without reading.
>> ----------
>>
>> Sorry folks, I still don't understand your comments. As Cody's original post
>> pointed out, there are a host of factors other than ease of programmability
>> or even quality of results that weigh against any change. To reiterate, all
>> companies have a huge infrastructure of **validated SAS code** that would
>> have to be replaced. This, in itself, would take years and cost tens of
>> millions of dollars at least. Also to reiterate, it's not only
>> statistical/reporting functionality but even more the integration into the
>> existing clinical database systems that would have to be rewritten **and
>> validated**. All this would have to be done while continuing full steam on
>> existing submissions. It is therefore not surprising to me that no pharma
>> company in its right mind even contemplates undertaking such an effort.
>>
>> To put these things into perspective. Let's say Pfizer has 200 SAS
>> programmers (it's probably more, as they are a large Pharma, but I dunno).
>> If each programmer costs, conservatively, $200K U.S. per year fully loaded,
>> that's $40 million U.S. for SAS Programmers. And this is probably a severe
>> underestimate. So the $14M quoted below is chicken feed -- it doesn't even
>> make the radar.
>>
>> To add further perspective, a single (large) pivotal clinical trial can
>> easily cost $250M. A delay in approval due to fooling around trying to
>> shift to a whole new software system could easily cost hundreds of millions
>> to billions if it means a competitor gets to the marketplace first. So, to
>> repeat, SAS costs are chicken feed.
>>
>> Yes, I suppose that the present system institutionalizes mediocrity. How
>> could it be otherwise in any such large scale enterprise? Continuity,
>> reliability, and robustness are all orders of magnitude more important for
>> both the FDA and Pharma to get safe and efficacious drugs to the public.
>> Constantly hopping onto the latest and greatest "craze" (yes, I exaggerate
>> here!) would be dangerous, unacceptable, and would probably delay drug
>> approvals. I consider this another example of the Kuhnian paradigm (Thomas
>> Kuhn: "The Structure of Scientific Revolutions") in action.
>>
>> This is **not** to say that there is not a useful role for R (or STATA or
>> ...) to play in clinical trial submissions or, more generally, in drug
>> research and development. There certainly is. For the record, I use R
>> exclusively in my (nonclinical statistics) work. Nor is it to say that all
>> change must be avoided. That would be downright dangerous. But let's please
>> keep these issues in perspective. One's enthusiasm for R's manifold virtues
>> should not replace common sense and logic. That, too, would be unfortunate.
>>
>> Since I've freely blustered, I am now a fair target. So I welcome forceful
>> rebuttals and criticisms and, as I've said what I wanted to, I will not
>> respond. You have the last word.
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
Marc,

A thoughtful, well reasoned discussion. I welcome this kind of analysis. (I was not criticizing Bert; I was using his post as an example of an unreasonable statement made to him, but not by him, that can serve as an object lesson for all of us.) I feel unhappy about posts which attack SAS, R or any other language just because the language is not R or SAS. Thoughtful comments like yours will get people to think about their choice of programming language. Many will choose R and that is good.

John

John Sorkin
JSorkin at grecc.umaryland.edu

-----Original Message-----
From: Marc Schwartz <marc_schwartz at me.com>
To: John Sorkin <jsorkin at grecc.umaryland.edu>
To: Gunter Bert <gunter.berton at gene.com>
Cc: Dieter Menne <dieter.menne at menne-biomed.de>
Cc: <r-help at r-project.org>
Sent: 2/19/2010 12:55:36 PM
Subject: Re: [R] Use of R in clinical trials

On Feb 19, 2010, at 6:56 AM, John Sorkin wrote:

> Bert,
> There is a lesson here. Just as intolerance of any statistical analysis program (or system) other than SAS should lead to our being driven crazy, so too should intolerance of
> any statistical analysis program (or system) other than R.
> John
>
>>>> Dieter Menne <dieter.menne at menne-biomed.de> 2/19/2010 3:46 AM >>>
>
> Bert,
>
> I like your comments. There is one issue, however, that drives me crazy
> whenever I meet a customer asking "You are not using SAS? Too bad, we need
> validated results."
>
>
> Bert Gunter wrote:
>>
>> ...
>> Also to reiterate, it's not only
>> statistical/reporting functionality but even more the integration into the
>> existing clinical database systems that would have to be rewritten **and
>> validated**.
>>
>>
>
> Implicitly: Even if you let your cat enter SAS code, the results are
> correct, because SAS is validated.
>
> Dieter

If I may, let me offer some comments, which, in part, are supportive of Bert's perspective.
First, the notion of validation that Bert raised should not be interpreted as indicating that SAS in a vacuum, "out of the box", is validated, as if there were a parallel to a Good Housekeeping Seal of Approval for statistical software. There is no such thing for any software in this domain.

Validation, in the context of regulated clinical trials (which we address in the R-FDA document available at http://www.r-project.org/doc/R-FDA.pdf), is defined by the FDA as:

"Establishing documented evidence which provides a high degree of assurance that a specific process will consistently produce a product meeting its predetermined specifications and quality attributes."

That is not something that can be provided by the vendor; it can only be done by the end user and their organization.

Now, that language is of course subject to interpretation, as FDA guidance is just that, "guidance". It is not prescriptive. One takes a risk mitigation based approach to implementing internal procedures and policies.

Internal validation is done via written Standard Operating Procedures (SOPs) that have been created, reviewed and approved by the end user's organization. The entire data path from the source database to final report output and data sets must be tested to assure reliability and reproducibility. Thus, there is a significant amount of time and cost involved with this process, and this is what Bert was referring to, which goes above and beyond the initial cost of the software and any annual licensing and support costs. It needs to be done irrespective of the software tool chain that one is using.

The scope (and therefore the cost) of the validation testing will be heavily influenced by the nature of one's environment (e.g., "big pharma" versus a "boutique drug house" versus a medical device company versus an independent contract research organization) and the level of risk mitigation (defined by lawyers) required by the organization.
These procedures and the associated documentation are also subject to on-site inspection by the FDA, which can shut you down if these are lacking.

It is not that one trusts SAS' output implicitly or by default. It is that one has documented through extensive testing that data manipulation and output is reliable, reproducible and, importantly, that one has also documented known problems (bugs, incorrect results) and workarounds, if any. The same applies to R.

Thus, while R may be "free" in all senses of that word, the actual monetary cost differential relative to software purchase and support is only one part of the equation and therefore of the "value proposition" to the organization.

If one is to transition from SAS to R, then one's organization has to evaluate the total costs and risks associated with that transition. SOPs have to be written, reviewed and approved. Data management, analysis and reporting code has to be re-written, tested and validated. Interfaces to database servers have to be tested. Programmers have to be re-trained in a new language and operating paradigm. Senior management has to be brought along to achieve a high level of comfort with anything new that may initially be seen as a risk factor in successfully doing business. A clear business case must be made to them that the advantages outweigh the potential risks.

Hence, there is a lot of resistance to the use of R in the clinical trial realm for these large companies because of those costs and timelines. Add to that the normal human behavioral factors of being resistant to change, and the hurdle for R in these large corporate environments is not trivial.

To make the move from the pre-clinical drug discovery realm that Bert and others here work in using R, where some of these issues are not relevant, to the human clinical trial realm, we need to overcome that resistance.
The organizations will also need to get to the point where the financial pressures are sufficient that paying millions of dollars per year in software licensing costs, and the additional millions for the associated FTEs, become relevant from a bottom line perspective. When they get to that point and the value of R becomes clear to them, progress will be made. It will be incremental and evolutionary and will happen in some organizations earlier than others based upon their size, profile and operating environment.

I might also point out that companies like SAS are sufficiently profitable that, in time, if pressure on their pricing is brought to bear, they will reduce their pricing to accommodate marketplace realities. They will reduce their margins rather than give up market share.

Part of the motivation to move to R may also be functional requirements that can only be satisfied with R as compared to other tools. That cannot be subjective "look and feel" characteristics, but more objective statistical methodological advantages.

To put some of the drug related costs in perspective, there was just a short letter to the editor in this week's issue of Applied Clinical Trials (an industry publication), which provides some insights into the challenges for drug development and approval, based upon industry outlook research done at Tufts (http://csdd.tufts.edu/news/complete_story/pr_outlook_2010).

The key figures from that study are that it takes over $1 billion U.S. and more than 7 years to take a drug from initial human trials to FDA approval. So based upon those figures alone, even if one spends $10 million per year on SAS licensing costs over the 7 years, for a total of $70 million, that is only ~7% of the cost of bringing one big drug to market.

Importantly, one has to recognize that the $10 million per year is in reality amortized over a much larger number of trials that are all running at the same time in various phases in the company's drug pipeline.
Thus, the real annual software costs attributable to any single drug are far lower as a percentage of the total cost of bringing that drug to market, which further reduces the financial pressure on R&D costs associated with the software alone.

Let's also not forget that one big blockbuster drug can bring in $1 billion in revenue per year post-approval, so that has to be considered as well. The company will recoup 7 years of clinical trial associated costs for that one drug in 1 year. Those revenues will stay high for a number of years, at least until a generic version of the drug is available, and potentially now with any changes in payments under any health care reform activities, at least here in the U.S., that may impact the revenue stream.

Just to pick one very large pharma company as an example (without naming it), 2009 revenues were in excess of $20 billion, with net income (profits) in excess of $4 billion. So even if you completely eliminated the $10 million for annual SAS licensing costs, you would have a marginal effect on that company's bottom line. There are bigger fish to fry.

The bigger cost savings being realized right now by large pharma, based upon the Tufts report, come from the more aggressive approach to terminating early phase trials when interim evidence becomes available suggesting the lack of viability of the drug. The use of adaptive trial designs is a key part of this change in process. The report shows a decline in recent years of the transition probability from Phase I to Phase II and from Phase II to Phase III. Curiously, the overall success rate of a drug from Phase I to FDA approval has stayed relatively stable at 16%, so there is more to be done here.

By terminating unfavorable drug trials earlier, the opportunity to save those costs and re-direct them to more promising drugs is significant. Those costs far outweigh infrastructure costs such as software.
In either case, while being advocates of the use of R, we cannot be blind to the business realities in play in this particular environment.

That being said, over the 8+ years that I have now been using R, the progress that has been made is nothing short of phenomenal. The growth of the community and the more recent publicity and recognition by SAS, SPSS and other vendors of R's influence are concrete signs of that progress.

Importantly, we are seeing the increasing use of R within the FDA and other regulatory bodies, which only serves to further enhance R's position in this domain. I have every confidence that this trend will continue for the foreseeable future.

Progress specifically within industry in the human clinical trials arena will be slow and evolutionary, as I have noted. It will likely take place incrementally in specific domains and for specific profiles of companies. As comfort with R grows, as more statisticians trained in R move from academia to industry, and as other factors become relevant which in turn give rise to opportunities for R, we will continue to see additional growth in this domain.

Regards,

Marc Schwartz
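The cost comparisons quoted in this thread reduce to a few lines of arithmetic. The following is only a minimal sketch in Python, using the round figures given in the messages above (200 programmers at $200K each per year; $10 million per year in licensing over 7 years against a $1 billion development cost). The function names are hypothetical and chosen purely for illustration, and none of the numbers are audited data.

```python
# Illustrative arithmetic only: every figure is a round number quoted in the
# thread, and the function names are hypothetical.


def annual_staff_cost(programmers: int, cost_per_programmer: float) -> float:
    """Fully loaded yearly cost of a programming staff."""
    return programmers * cost_per_programmer


def licensing_share(annual_license: float, years: int, development_cost: float) -> float:
    """Total licensing spend as a fraction of one drug's development cost."""
    return (annual_license * years) / development_cost


# Staffing estimate from the thread: 200 programmers at $200K each per year.
staff = annual_staff_cost(200, 200_000)

# Licensing estimate from the thread: $10M per year for 7 years against $1B.
share = licensing_share(10_000_000, 7, 1_000_000_000)

print(f"Staff cost per year: ${staff / 1e6:.0f}M")
print(f"Licensing share of development cost: {share:.0%}")
```

Because the licensing figure is spread across every trial running in a company's pipeline, the share attributable to any single drug would be smaller than this single-drug calculation suggests.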