anthony.rossini at novartis.com
2007-May-09  14:26 UTC
[R] Unit Testing Frameworks: summary and brief discussion
Greetings - I'm finally finished review, here's what I heard: ============ from Tobias Verbeke: anthony.rossini@novartis.com wrote:> Greetings! > > After a quick look at current programming tools, especially with regards> to unit-testing frameworks, I've started looking at both "butler" and > "RUnit". I would be grateful to receieve real world development > experience and opinions with either/both. Please send to me directly > (yes, this IS my work email), I will summarize (named or anonymous, as > contributers desire) to the list. >I'm founding member of an R Competence Center at an international consulting company delivering R services mainly to the financial and pharmaceutical industries. Unit testing is central to our development methodology and we've been systematically using RUnit with great satisfaction, mainly because of its simplicity. The presentation of test reports is basic, though. Experiences concerning interaction with the RUnit developers are very positive: gentle and responsive people. We've never used butler. I think it is not actively developed (even if the developer is very active). It should be said that many of our developers (including myself) have backgrounds in statistics (more than in cs or software engineering) and are not always acquainted with the functionality in other unit testing frameworks and the way they integrate in IDEs as is common in these other languages. I'll soon be personally working with a JUnit guru and will take the opportunity to benchmark RUnit/ESS/emacs against his toolkit (Eclipse with JUnit- and other plugins, working `in perfect harmony' (his words)). Even if in my opinion the philosophy of test-driven development is much more important than the tools used, it is useful to question them from time to time and your message reminded me of this... I'll keep you posted if it interests you. Why not work out an evaluation grid / check list for unit testing frameworks ? Totally unrelated to the former, it might be interesting to ask oneself how ESS could be extended to ease unit testing: after refactoring a function some M-x ess-unit-test-function automagically launches the unit-test for this particular function (based on the test function naming scheme), opens a *test report* buffer etc. Kind regards, Tobias ============ from Tony Plate: Hi, I've been looking at testing frameworks for R too, so I'm interested to hear of your experiences & perspective. Here's my own experiences & perspective: The requirements are: (1) it should be very easy to construct and maintain tests (2) it should be easy to run tests, both automatically and manually (3) it should be simple to look at test results and know what went wrong where I've been using a homegrown testing framework for S-PLUS that is loosely based on the R transcript style tests (run *.R and compare output with *.Rout.save in 'tests' dir). There are two differences between this test framework and the standard R one: (1) the output to match and the input commands are generated from an annotated transcript (annotations can switch some tests in or out depending on the version used) (2) annotations can include text substitutions (regular expression style) to be made on the output before attempting to match (this helps make it easier to construct tests that will match across different versions that might have minor cosmetic differences in how output is formatted). We use this test framework for both unit-style tests and system testing (where multiple libraries interact and also call the database). One very nice aspect of this framework is that it is easy to construct tests -- just cut and paste from a command window. Many tests can be generated very quickly this way (my impression is that is is much much faster to build tests by cutting and pasting transcripts from a command window than it is to build tests that use functions like all.equal() to compare data structures.) It is also easy to maintain tests in the face of change (e.g., with a new version of S-PLUS or with bug fixes to functions or with changed database contents) -- I use ediff in emacs to compare test output with the stored annotated transcript and can usually just use ediff commands to update the transcript. This has worked well for us and now we are looking at porting some code to R. I've not seen anything that offers these conveniences in R. It wouldn't be too difficult to add these features to the built-in R testing framework, but I've not had success in getting anyone in R core to listen to even consider changes, so I've not pursued that route after an initial offer of some simple patches to tests.mk and wintests.mk. RUnit doesn't have transcript-style tests, but it wasn't very difficult to add support for transcript-style tests to it. I'll probably go ahead and use some version of that for our porting project. (And offer it to the community if the RUnit maintainers want to incorporate it.) I also like the idea that RUnit has some code analysis tools -- that might support some future project that allowed one to catalogue the number of times each code path through a function was exercised by the tests. I just looked at 'butler' and it looks very much like RUnit to me -- and I didn't see any overview that explained differences. Do you know of any differences? cheers, Tony Plate ============== from Paul Gilbert: Tony While this is not exactly your question, I have been using my own system based on make and the tools use by R CMD build/check to do something I think of as unit testing. This pre-dates the unit-testing frameworks, in fact, some of it predates R. I actually wrote something on this at one point: Paul Gilbert. R package maintenance. R News, 4(2):21-24, September 2004. I have occasionally thought about trying to use RUnit, but never done much because I am relatively happy with what I have. (Inertia is an issue too.) I would be happy to hear if you do an assessment of the various tools. Best, Paul Gilbert ============= From Seth Falcon: Hi Tony, anthony.rossini@novartis.com writes:> After a quick look at current programming tools, especially with regards> to unit-testing frameworks, I've started looking at both "butler" and > "RUnit". I would be grateful to receieve real world development > experience and opinions with either/both. Please send to me directly > (yes, this IS my work email), I will summarize (named or anonymous, as > contributers desire) to the list.I've been using RUnit and have been quite happy with it. I had not heard of butler until I read your mail (!). RUnit behaves reasonably similarly to other *Unit frameworks and this made it easy to get started with as I have used both JUnit and PyUnit (unittest module). Two things to be wary of: 1. At last check, you cannot create classes in unit test code and this makes it difficult to test some types of functionality. I'm really not sure to what extent this is RUnit's fault as opposed to limitation of the S4 implemenation in R. 2. They have chosen a non-default RNG, but recent versions provide a way to override this. This provided for some difficult bug hunting when unit tests behaved differently than hand-run code even with set.seed(). The maintainer has been receptive to feedback and patches. You can look at the not-so-beautiful scripts and such we are using if you look at inst/UnitTest in: Category, GOstats, Biobase, graph Best Wishes, + seth =================== Discussion: After a bit more cursory use, it looks like RUnit is probably the right approach at this time (sorry Hadley!). Both RUnit and butler have a range of testing facilities and programming support tools. I support the above statements about feasibility and problems -- except that I didn't get a chance to checkout the S4 issues that Seth raised above. The one piece that I found missing in my version was some form of GUI based tester, i.e. push a button and test, but I think I've not thought through some of the issues with environments and closures yet that might cause problems. Thanks to everyone for responses! It's clear that there is a good start here, but lots of room for improvement exists. Best regards / Mit freundlichen GrĂ¼ssen, Anthony (Tony) Rossini Novartis Pharma AG MODELING & SIMULATION Group Head a.i., EU Statistical Modeling CHBS, WSJ-027.1.012 Novartis Pharma AG Lichtstrasse 35 CH-4056 Basel Switzerland Phone: +41 61 324 4186 Fax: +41 61 324 3039 Cell: +41 79 367 4557 Email : anthony.rossini@novartis.com [[alternative HTML version deleted]]
ml-r-help at epigenomics.com
2007-May-09  15:58 UTC
[R] Unit Testing Frameworks: summary and brief discussion
anthony.rossini at novartis.com wrote:> Greetings - > > I'm finally finished review, here's what I heard:Hi Anthony, sorry for replying late. I'd like to chip in a brief experience report for our company. We have used RUnit since 2003 starting with R 1.6.2 for our R software development. Since then it has been used for development for over a dozen packages with ~ 3k unit tests. Making use of S4 classes and methods with clear design contracts made its application even more fruitful. To automate the test process we utilized Linux tools to tailor a build and test system to check our R packages with previous, current and development versions of R, CRAN and BioC packages to guarantee backward compatibility as far as possible whilst adapting to changes. Over time the main benefits have been - fearless refactoring of major building blocks of our class hierarchy - early detection of and adaptation to changes in new R versions - data workflow integration testing starting with some data warehouse query initiated from R throughout to generated analysis reports using sweave or similar report generators. With this changes in the warehouse, R, some CRAN R package, our code or the report templates could be spotted and fixed well before any time critical analysis was due rewarding the additional effort to write and maintain the tests. Best regards, Matthias> ============ from Tobias Verbeke: > > anthony.rossini at novartis.com wrote: >> Greetings! >> >> After a quick look at current programming tools, especially with regards > >> to unit-testing frameworks, I've started looking at both "butler" and >> "RUnit". I would be grateful to receieve real world development >> experience and opinions with either/both. Please send to me directly >> (yes, this IS my work email), I will summarize (named or anonymous, as >> contributers desire) to the list. >> > I'm founding member of an R Competence Center at an international > consulting company delivering R services > mainly to the financial and pharmaceutical industries. Unit testing is > central to our development methodology > and we've been systematically using RUnit with great satisfaction, > mainly because of its simplicity. The > presentation of test reports is basic, though. Experiences concerning > interaction with the RUnit developers > are very positive: gentle and responsive people. > > We've never used butler. I think it is not actively developed (even if > the developer is very active). > > It should be said that many of our developers (including myself) have > backgrounds in statistics (more than in cs > or software engineering) and are not always acquainted with the > functionality in other unit testing frameworks > and the way they integrate in IDEs as is common in these other languages. > > I'll soon be personally working with a JUnit guru and will take the > opportunity to benchmark RUnit/ESS/emacs against > his toolkit (Eclipse with JUnit- and other plugins, working `in perfect > harmony' (his words)). Even if in my opinion the > philosophy of test-driven development is much more important than the > tools used, it is useful to question them from > time to time and your message reminded me of this... I'll keep you > posted if it interests you. Why not work out an > evaluation grid / check list for unit testing frameworks ? > > Totally unrelated to the former, it might be interesting to ask oneself > how ESS could be extended to ease unit testing: > after refactoring a function some M-x ess-unit-test-function > automagically launches the unit-test for this particular > function (based on the test function naming scheme), opens a *test > report* buffer etc. > > Kind regards, > Tobias > > ============ from Tony Plate: > > Hi, I've been looking at testing frameworks for R too, so I'm interested > to hear of your experiences & perspective. > > Here's my own experiences & perspective: > The requirements are: > > (1) it should be very easy to construct and maintain tests > (2) it should be easy to run tests, both automatically and manually > (3) it should be simple to look at test results and know what went wrong > where > > I've been using a homegrown testing framework for S-PLUS that is loosely > based on the R transcript style tests (run *.R and compare output with > *.Rout.save in 'tests' dir). There are two differences between this > test framework and the standard R one: > (1) the output to match and the input commands are generated from an > annotated transcript (annotations can switch some tests in or out > depending on the version used) > (2) annotations can include text substitutions (regular expression > style) to be made on the output before attempting to match (this helps > make it easier to construct tests that will match across different > versions that might have minor cosmetic differences in how output is > formatted). > > We use this test framework for both unit-style tests and system testing > (where multiple libraries interact and also call the database). > One very nice aspect of this framework is that it is easy to construct > tests -- just cut and paste from a command window. Many tests can be > generated very quickly this way (my impression is that is is much much > faster to build tests by cutting and pasting transcripts from a command > window than it is to build tests that use functions like all.equal() to > compare data structures.) It is also easy to maintain tests in the face > of change (e.g., with a new version of S-PLUS or with bug fixes to > functions or with changed database contents) -- I use ediff in emacs to > compare test output with the stored annotated transcript and can usually > just use ediff commands to update the transcript. > > This has worked well for us and now we are looking at porting some code > to R. I've not seen anything that offers these conveniences in R. > > It wouldn't be too difficult to add these features to the built-in R > testing framework, but I've not had success in getting anyone in R core > to listen to even consider changes, so I've not pursued that route after > an initial offer of some simple patches to tests.mk and wintests.mk. > > RUnit doesn't have transcript-style tests, but it wasn't very difficult > to add support for transcript-style tests to it. I'll probably go ahead > and use some version of that for our porting project. (And offer it to > the community if the RUnit maintainers want to incorporate it.) I also > like the idea that RUnit has some code analysis tools -- that might > support some future project that allowed one to catalogue the number of > times each code path through a function was exercised by the tests. > > I just looked at 'butler' and it looks very much like RUnit to me -- and > I didn't see any overview that explained differences. Do you know of > any differences? > > cheers, > > Tony Plate > > > ============== from Paul Gilbert: > > Tony > > While this is not exactly your question, I have been using my own system > based on make and the tools use by R CMD build/check to do something I > think of as unit testing. This pre-dates the unit-testing frameworks, in > fact, some of it predates R. I actually wrote something on this at one > point: Paul Gilbert. R package maintenance. R News, 4(2):21-24, > September 2004. > > I have occasionally thought about trying to use RUnit, but never done > much because I am relatively happy with what I have. (Inertia is an > issue too.) I would be happy to hear if you do an assessment of the > various tools. > > Best, > Paul Gilbert > > > ============= From Seth Falcon: > > Hi Tony, > > anthony.rossini at novartis.com writes: >> After a quick look at current programming tools, especially with regards > >> to unit-testing frameworks, I've started looking at both "butler" and >> "RUnit". I would be grateful to receieve real world development >> experience and opinions with either/both. Please send to me directly >> (yes, this IS my work email), I will summarize (named or anonymous, as >> contributers desire) to the list. > > I've been using RUnit and have been quite happy with it. I had not > heard of butler until I read your mail (!). > > RUnit behaves reasonably similarly to other *Unit frameworks and this > made it easy to get started with as I have used both JUnit and PyUnit > (unittest module). > > Two things to be wary of: > > 1. At last check, you cannot create classes in unit test code and > this makes it difficult to test some types of functionality. I'm > really not sure to what extent this is RUnit's fault as opposed > to limitation of the S4 implemenation in R. > > 2. They have chosen a non-default RNG, but recent versions provide a > way to override this. This provided for some difficult bug > hunting when unit tests behaved differently than hand-run code > even with set.seed(). > > The maintainer has been receptive to feedback and patches. You can > look at the not-so-beautiful scripts and such we are using if you look > at inst/UnitTest in: Category, GOstats, Biobase, graph > > Best Wishes, > > + seth > > > =================== Discussion: > > After a bit more cursory use, it looks like RUnit is probably the right > approach at this time (sorry Hadley!). Both RUnit and butler have a > range of testing facilities and programming support tools. I support the > above statements about feasibility and problems -- except that I didn't > get a chance to checkout the S4 issues that Seth raised above. The one > piece that I found missing in my version was some form of GUI based > tester, i.e. push a button and test, but I think I've not thought through > some of the issues with environments and closures yet that might cause > problems. > > Thanks to everyone for responses! It's clear that there is a good start > here, but lots of room for improvement exists. > > Best regards / Mit freundlichen Gr?ssen, > Anthony (Tony) Rossini > Novartis Pharma AG > MODELING & SIMULATION > Group Head a.i., EU Statistical Modeling > CHBS, WSJ-027.1.012 > Novartis Pharma AG > Lichtstrasse 35 > CH-4056 Basel > Switzerland > Phone: +41 61 324 4186 > Fax: +41 61 324 3039 > Cell: +41 79 367 4557 > Email : anthony.rossini at novartis.com > > [[alternative HTML version deleted]] > > > > ------------------------------------------------------------------------ > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Matthias Burger Project Manager/ Biostatistician Epigenomics AG Kleine Praesidentenstr. 1 10178 Berlin, Germany phone:+49-30-24345-371 fax:+49-30-24345-555 http://www.epigenomics.com matthias.burger at epigenomics.com -- Epigenomics AG Berlin Amtsgericht Charlottenburg HRB 75861 Vorstand: Geert Nygaard (CEO/Vorsitzender), Dr. Kurt Berlin (CSO) Oliver Schacht PhD (CFO), Christian Piepenbrock (COO) Aufsichtsrat: Prof. Dr. Dr. hc. Rolf Krebs (Chairman/Vorsitzender)
ml-r-help at epigenomics.com
2007-May-09  16:23 UTC
[R] Unit Testing Frameworks: summary and brief discussion
anthony.rossini at novartis.com wrote: [...]> ============= From Seth Falcon: > > Hi Tony, > > anthony.rossini at novartis.com writes: >> After a quick look at current programming tools, especially with regards > >> to unit-testing frameworks, I've started looking at both "butler" and >> "RUnit". I would be grateful to receieve real world development >> experience and opinions with either/both. Please send to me directly >> (yes, this IS my work email), I will summarize (named or anonymous, as >> contributers desire) to the list. > > I've been using RUnit and have been quite happy with it. I had not > heard of butler until I read your mail (!). > > RUnit behaves reasonably similarly to other *Unit frameworks and this > made it easy to get started with as I have used both JUnit and PyUnit > (unittest module). > > Two things to be wary of: > > 1. At last check, you cannot create classes in unit test code and > this makes it difficult to test some types of functionality. I'm > really not sure to what extent this is RUnit's fault as opposed > to limitation of the S4 implemenation in R.I'd be very interested to hear what problems you experienced. If you have any example ready I'd be happy to take a look at it. So far we have not observed (severe) problems to create S4 classes and test them in unit test code. We actually use RUnit mainly on S4 classes and methods. There are even some very simple checks in RUnits own test cases which create and use S4 classes. For example in tests/runitRunit.r in the source package.> 2. They have chosen a non-default RNG, but recent versions provide a > way to override this. This provided for some difficult bug > hunting when unit tests behaved differently than hand-run code > even with set.seed(). > > The maintainer has been receptive to feedback and patches. You can > look at the not-so-beautiful scripts and such we are using if you look > at inst/UnitTest in: Category, GOstats, Biobase, graph > > Best Wishes, > > + seth >[...] Best, Matthias -- Matthias Burger Project Manager/ Biostatistician Epigenomics AG Kleine Praesidentenstr. 1 10178 Berlin, Germany phone:+49-30-24345-371 fax:+49-30-24345-555 http://www.epigenomics.com matthias.burger at epigenomics.com -- Epigenomics AG Berlin Amtsgericht Charlottenburg HRB 75861 Vorstand: Geert Nygaard (CEO/Vorsitzender), Dr. Kurt Berlin (CSO) Oliver Schacht PhD (CFO), Christian Piepenbrock (COO) Aufsichtsrat: Prof. Dr. Dr. hc. Rolf Krebs (Chairman/Vorsitzender)
Paul Gilbert
2007-May-10  01:18 UTC
[R] Unit Testing Frameworks: summary and brief discussion
Tony Thanks for the summary. My ad hoc system is pretty good for catching flagged errors, and numerical errors when I have a check. Could you (or someone else) comment on how easy it would be with one of these more formal frameworks to do three things I have not been able to accomplish easily: - My code gives error and warning messages in some situations. I want to test that the errors and warnings work, but these flags are the correct response to the test. In fact, it is an error if I don't get the flag. How easy is it to set up automatic tests to check warning and error messages work? - For some things it is the printed format that matters. How easy is it to set up a test of the printed output? (Something like the Rout files used in R CMD check.) I think this is what Tony Plate is calling transcript file tests, and I guess it is not automatically available. I am not really interested in something I would have to change with each new release of R, and I need it to work cross-platform. I want to know when something has changed, in R or my own code, without having to examine the output carefully. - (And now the hard one.) For some things it is the plotted output that matters. Is it possible to set up automatic tests of plotting? I can already test that plots run. I want to know if they "look very different". And no, I don't have a clue where to start on this one. Paul Gilbert anthony.rossini at novartis.com wrote:> Greetings - > > I'm finally finished review, here's what I heard: > > ============ from Tobias Verbeke: > > anthony.rossini at novartis.com wrote: >> Greetings! >> >> After a quick look at current programming tools, especially with regards > >> to unit-testing frameworks, I've started looking at both "butler" and >> "RUnit". I would be grateful to receieve real world development >> experience and opinions with either/both. Please send to me directly >> (yes, this IS my work email), I will summarize (named or anonymous, as >> contributers desire) to the list. >> > I'm founding member of an R Competence Center at an international > consulting company delivering R services > mainly to the financial and pharmaceutical industries. Unit testing is > central to our development methodology > and we've been systematically using RUnit with great satisfaction, > mainly because of its simplicity. The > presentation of test reports is basic, though. Experiences concerning > interaction with the RUnit developers > are very positive: gentle and responsive people. > > We've never used butler. I think it is not actively developed (even if > the developer is very active). > > It should be said that many of our developers (including myself) have > backgrounds in statistics (more than in cs > or software engineering) and are not always acquainted with the > functionality in other unit testing frameworks > and the way they integrate in IDEs as is common in these other languages. > > I'll soon be personally working with a JUnit guru and will take the > opportunity to benchmark RUnit/ESS/emacs against > his toolkit (Eclipse with JUnit- and other plugins, working `in perfect > harmony' (his words)). Even if in my opinion the > philosophy of test-driven development is much more important than the > tools used, it is useful to question them from > time to time and your message reminded me of this... I'll keep you > posted if it interests you. Why not work out an > evaluation grid / check list for unit testing frameworks ? > > Totally unrelated to the former, it might be interesting to ask oneself > how ESS could be extended to ease unit testing: > after refactoring a function some M-x ess-unit-test-function > automagically launches the unit-test for this particular > function (based on the test function naming scheme), opens a *test > report* buffer etc. > > Kind regards, > Tobias > > ============ from Tony Plate: > > Hi, I've been looking at testing frameworks for R too, so I'm interested > to hear of your experiences & perspective. > > Here's my own experiences & perspective: > The requirements are: > > (1) it should be very easy to construct and maintain tests > (2) it should be easy to run tests, both automatically and manually > (3) it should be simple to look at test results and know what went wrong > where > > I've been using a homegrown testing framework for S-PLUS that is loosely > based on the R transcript style tests (run *.R and compare output with > *.Rout.save in 'tests' dir). There are two differences between this > test framework and the standard R one: > (1) the output to match and the input commands are generated from an > annotated transcript (annotations can switch some tests in or out > depending on the version used) > (2) annotations can include text substitutions (regular expression > style) to be made on the output before attempting to match (this helps > make it easier to construct tests that will match across different > versions that might have minor cosmetic differences in how output is > formatted). > > We use this test framework for both unit-style tests and system testing > (where multiple libraries interact and also call the database). > One very nice aspect of this framework is that it is easy to construct > tests -- just cut and paste from a command window. Many tests can be > generated very quickly this way (my impression is that is is much much > faster to build tests by cutting and pasting transcripts from a command > window than it is to build tests that use functions like all.equal() to > compare data structures.) It is also easy to maintain tests in the face > of change (e.g., with a new version of S-PLUS or with bug fixes to > functions or with changed database contents) -- I use ediff in emacs to > compare test output with the stored annotated transcript and can usually > just use ediff commands to update the transcript. > > This has worked well for us and now we are looking at porting some code > to R. I've not seen anything that offers these conveniences in R. > > It wouldn't be too difficult to add these features to the built-in R > testing framework, but I've not had success in getting anyone in R core > to listen to even consider changes, so I've not pursued that route after > an initial offer of some simple patches to tests.mk and wintests.mk. > > RUnit doesn't have transcript-style tests, but it wasn't very difficult > to add support for transcript-style tests to it. I'll probably go ahead > and use some version of that for our porting project. (And offer it to > the community if the RUnit maintainers want to incorporate it.) I also > like the idea that RUnit has some code analysis tools -- that might > support some future project that allowed one to catalogue the number of > times each code path through a function was exercised by the tests. > > I just looked at 'butler' and it looks very much like RUnit to me -- and > I didn't see any overview that explained differences. Do you know of > any differences? > > cheers, > > Tony Plate > > > ============== from Paul Gilbert: > > Tony > > While this is not exactly your question, I have been using my own system > based on make and the tools use by R CMD build/check to do something I > think of as unit testing. This pre-dates the unit-testing frameworks, in > fact, some of it predates R. I actually wrote something on this at one > point: Paul Gilbert. R package maintenance. R News, 4(2):21-24, > September 2004. > > I have occasionally thought about trying to use RUnit, but never done > much because I am relatively happy with what I have. (Inertia is an > issue too.) I would be happy to hear if you do an assessment of the > various tools. > > Best, > Paul Gilbert > > > ============= From Seth Falcon: > > Hi Tony, > > anthony.rossini at novartis.com writes: >> After a quick look at current programming tools, especially with regards > >> to unit-testing frameworks, I've started looking at both "butler" and >> "RUnit". I would be grateful to receieve real world development >> experience and opinions with either/both. Please send to me directly >> (yes, this IS my work email), I will summarize (named or anonymous, as >> contributers desire) to the list. > > I've been using RUnit and have been quite happy with it. I had not > heard of butler until I read your mail (!). > > RUnit behaves reasonably similarly to other *Unit frameworks and this > made it easy to get started with as I have used both JUnit and PyUnit > (unittest module). > > Two things to be wary of: > > 1. At last check, you cannot create classes in unit test code and > this makes it difficult to test some types of functionality. I'm > really not sure to what extent this is RUnit's fault as opposed > to limitation of the S4 implemenation in R. > > 2. They have chosen a non-default RNG, but recent versions provide a > way to override this. This provided for some difficult bug > hunting when unit tests behaved differently than hand-run code > even with set.seed(). > > The maintainer has been receptive to feedback and patches. You can > look at the not-so-beautiful scripts and such we are using if you look > at inst/UnitTest in: Category, GOstats, Biobase, graph > > Best Wishes, > > + seth > > > =================== Discussion: > > After a bit more cursory use, it looks like RUnit is probably the right > approach at this time (sorry Hadley!). Both RUnit and butler have a > range of testing facilities and programming support tools. I support the > above statements about feasibility and problems -- except that I didn't > get a chance to checkout the S4 issues that Seth raised above. The one > piece that I found missing in my version was some form of GUI based > tester, i.e. push a button and test, but I think I've not thought through > some of the issues with environments and closures yet that might cause > problems. > > Thanks to everyone for responses! It's clear that there is a good start > here, but lots of room for improvement exists. > > Best regards / Mit freundlichen Gr?ssen, > Anthony (Tony) Rossini > Novartis Pharma AG > MODELING & SIMULATION > Group Head a.i., EU Statistical Modeling > CHBS, WSJ-027.1.012 > Novartis Pharma AG > Lichtstrasse 35 > CH-4056 Basel > Switzerland > Phone: +41 61 324 4186 > Fax: +41 61 324 3039 > Cell: +41 79 367 4557 > Email : anthony.rossini at novartis.com > > [[alternative HTML version deleted]] > > > > ------------------------------------------------------------------------ > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.=================================================================================== La version fran?aise suit le texte anglais. ------------------------------------------------------------------------------------ This email may contain privileged and/or confidential inform...{{dropped}}
hadley wickham
2007-May-10  09:07 UTC
[R] Unit Testing Frameworks: summary and brief discussion
Yes, butler is pretty much abandoned. I didn't realise that runit existed when I first wrote it, so much of the functionality is probably implemented better there (with maintainers that are actually doing something with the package). That said, the purpose of butler wasn't only for unit testing, it also contains some code (which doesn't duplicate functionality available elsewhere, to the best of my knowledge) for benchmarking (benchmark) and profiling (stopwatch). The profiling code formats the output of Rprof in a slightly different way to the default, and also has a graphical display. There is to be a natural connection between profiling and tree based partitioning, so I hope one day to implement a visualisation, based on the ideas of Klimt, that makes it easier to explore profiling data. Hadley On 5/9/07, anthony.rossini at novartis.com <anthony.rossini at novartis.com> wrote:> Greetings - > > I'm finally finished review, here's what I heard: > > ============ from Tobias Verbeke: > > anthony.rossini at novartis.com wrote: > > Greetings! > > > > After a quick look at current programming tools, especially with regards > > > to unit-testing frameworks, I've started looking at both "butler" and > > "RUnit". I would be grateful to receieve real world development > > experience and opinions with either/both. Please send to me directly > > (yes, this IS my work email), I will summarize (named or anonymous, as > > contributers desire) to the list. > > > I'm founding member of an R Competence Center at an international > consulting company delivering R services > mainly to the financial and pharmaceutical industries. Unit testing is > central to our development methodology > and we've been systematically using RUnit with great satisfaction, > mainly because of its simplicity. The > presentation of test reports is basic, though. Experiences concerning > interaction with the RUnit developers > are very positive: gentle and responsive people. > > We've never used butler. I think it is not actively developed (even if > the developer is very active). > > It should be said that many of our developers (including myself) have > backgrounds in statistics (more than in cs > or software engineering) and are not always acquainted with the > functionality in other unit testing frameworks > and the way they integrate in IDEs as is common in these other languages. > > I'll soon be personally working with a JUnit guru and will take the > opportunity to benchmark RUnit/ESS/emacs against > his toolkit (Eclipse with JUnit- and other plugins, working `in perfect > harmony' (his words)). Even if in my opinion the > philosophy of test-driven development is much more important than the > tools used, it is useful to question them from > time to time and your message reminded me of this... I'll keep you > posted if it interests you. Why not work out an > evaluation grid / check list for unit testing frameworks ? > > Totally unrelated to the former, it might be interesting to ask oneself > how ESS could be extended to ease unit testing: > after refactoring a function some M-x ess-unit-test-function > automagically launches the unit-test for this particular > function (based on the test function naming scheme), opens a *test > report* buffer etc. > > Kind regards, > Tobias > > ============ from Tony Plate: > > Hi, I've been looking at testing frameworks for R too, so I'm interested > to hear of your experiences & perspective. > > Here's my own experiences & perspective: > The requirements are: > > (1) it should be very easy to construct and maintain tests > (2) it should be easy to run tests, both automatically and manually > (3) it should be simple to look at test results and know what went wrong > where > > I've been using a homegrown testing framework for S-PLUS that is loosely > based on the R transcript style tests (run *.R and compare output with > *.Rout.save in 'tests' dir). There are two differences between this > test framework and the standard R one: > (1) the output to match and the input commands are generated from an > annotated transcript (annotations can switch some tests in or out > depending on the version used) > (2) annotations can include text substitutions (regular expression > style) to be made on the output before attempting to match (this helps > make it easier to construct tests that will match across different > versions that might have minor cosmetic differences in how output is > formatted). > > We use this test framework for both unit-style tests and system testing > (where multiple libraries interact and also call the database). > One very nice aspect of this framework is that it is easy to construct > tests -- just cut and paste from a command window. Many tests can be > generated very quickly this way (my impression is that is is much much > faster to build tests by cutting and pasting transcripts from a command > window than it is to build tests that use functions like all.equal() to > compare data structures.) It is also easy to maintain tests in the face > of change (e.g., with a new version of S-PLUS or with bug fixes to > functions or with changed database contents) -- I use ediff in emacs to > compare test output with the stored annotated transcript and can usually > just use ediff commands to update the transcript. > > This has worked well for us and now we are looking at porting some code > to R. I've not seen anything that offers these conveniences in R. > > It wouldn't be too difficult to add these features to the built-in R > testing framework, but I've not had success in getting anyone in R core > to listen to even consider changes, so I've not pursued that route after > an initial offer of some simple patches to tests.mk and wintests.mk. > > RUnit doesn't have transcript-style tests, but it wasn't very difficult > to add support for transcript-style tests to it. I'll probably go ahead > and use some version of that for our porting project. (And offer it to > the community if the RUnit maintainers want to incorporate it.) I also > like the idea that RUnit has some code analysis tools -- that might > support some future project that allowed one to catalogue the number of > times each code path through a function was exercised by the tests. > > I just looked at 'butler' and it looks very much like RUnit to me -- and > I didn't see any overview that explained differences. Do you know of > any differences? > > cheers, > > Tony Plate > > > ============== from Paul Gilbert: > > Tony > > While this is not exactly your question, I have been using my own system > based on make and the tools use by R CMD build/check to do something I > think of as unit testing. This pre-dates the unit-testing frameworks, in > fact, some of it predates R. I actually wrote something on this at one > point: Paul Gilbert. R package maintenance. R News, 4(2):21-24, > September 2004. > > I have occasionally thought about trying to use RUnit, but never done > much because I am relatively happy with what I have. (Inertia is an > issue too.) I would be happy to hear if you do an assessment of the > various tools. > > Best, > Paul Gilbert > > > ============= From Seth Falcon: > > Hi Tony, > > anthony.rossini at novartis.com writes: > > After a quick look at current programming tools, especially with regards > > > to unit-testing frameworks, I've started looking at both "butler" and > > "RUnit". I would be grateful to receieve real world development > > experience and opinions with either/both. Please send to me directly > > (yes, this IS my work email), I will summarize (named or anonymous, as > > contributers desire) to the list. > > I've been using RUnit and have been quite happy with it. I had not > heard of butler until I read your mail (!). > > RUnit behaves reasonably similarly to other *Unit frameworks and this > made it easy to get started with as I have used both JUnit and PyUnit > (unittest module). > > Two things to be wary of: > > 1. At last check, you cannot create classes in unit test code and > this makes it difficult to test some types of functionality. I'm > really not sure to what extent this is RUnit's fault as opposed > to limitation of the S4 implemenation in R. > > 2. They have chosen a non-default RNG, but recent versions provide a > way to override this. This provided for some difficult bug > hunting when unit tests behaved differently than hand-run code > even with set.seed(). > > The maintainer has been receptive to feedback and patches. You can > look at the not-so-beautiful scripts and such we are using if you look > at inst/UnitTest in: Category, GOstats, Biobase, graph > > Best Wishes, > > + seth > > > =================== Discussion: > > After a bit more cursory use, it looks like RUnit is probably the right > approach at this time (sorry Hadley!). Both RUnit and butler have a > range of testing facilities and programming support tools. I support the > above statements about feasibility and problems -- except that I didn't > get a chance to checkout the S4 issues that Seth raised above. The one > piece that I found missing in my version was some form of GUI based > tester, i.e. push a button and test, but I think I've not thought through > some of the issues with environments and closures yet that might cause > problems. > > Thanks to everyone for responses! It's clear that there is a good start > here, but lots of room for improvement exists. > > Best regards / Mit freundlichen Gr?ssen, > Anthony (Tony) Rossini > Novartis Pharma AG > MODELING & SIMULATION > Group Head a.i., EU Statistical Modeling > CHBS, WSJ-027.1.012 > Novartis Pharma AG > Lichtstrasse 35 > CH-4056 Basel > Switzerland > Phone: +41 61 324 4186 > Fax: +41 61 324 3039 > Cell: +41 79 367 4557 > Email : anthony.rossini at novartis.com > > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >