Dirk Eddelbuettel
2016-Aug-19 16:40 UTC
[Rd] summary.default rounding on numeric seems inconsistent with other R behaviors
It is the old story of defined behaviour and expected outcomes. Hard to
change now.

So I would suggest you do something like this in your ~/.Rprofile:

R> smry <- function(...) summary(..., digits=6)
R> smry(155555L)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 155555  155555  155555  155555  155555  155555
R>

Maybe call it Summary() instead.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
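For context, the rounding that the wrapper works around can be reproduced with signif() alone. This is only a minimal sketch: it assumes, as far as I understand the summary.default() of that time, that the computed statistics were passed through signif() with a default of max(3, getOption("digits") - 3) significant digits, which is 4 under the stock digits = 7 setting. The variable names are made up for illustration.

```r
# Assumption: the older summary.default() rounded its result with
# signif(value, digits), digits defaulting to max(3, getOption("digits") - 3).
x <- 155555

default_digits <- max(3L, 7L - 3L)            # 4 when options(digits = 7)

rounded_default <- signif(x, default_digits)  # default rounding
rounded_six     <- signif(x, 6)               # rounding with digits = 6

print(rounded_default)  # [1] 155600
print(rounded_six)      # [1] 155555
```

With six significant digits the value survives intact, which is why passing digits=6 gives the expected output for a six-digit number.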
Martin Maechler
2016-Aug-23 12:33 UTC
[Rd] summary.default rounding on numeric seems inconsistent with other R behaviors
>>>>> Dirk Eddelbuettel <edd at debian.org>
>>>>>     on Fri, 19 Aug 2016 11:40:05 -0500 writes:

> It is the old story of defined behaviour and expected outcomes. Hard to
> change now.

yes... not impossible though... see below

> So I would suggest you do something like this in your ~/.Rprofile:
>
> R> smry <- function(...) summary(..., digits=6)
> R> smry(155555L)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  155555  155555  155555  155555  155555  155555
> R>
>
> Maybe call it Summary() instead.

yes, do use a different name. There are other such functions, e.g.
'summarize()'.

Simone wrote

> I had raised the matter ten years ago, and I was told that the topic was
> already very^3 old
>
> https://stat.ethz.ch/pipermail/r-devel/2006-September/042684.html
>
> there is some discussion on its origin and also a declaration of intents to
> change the default behaviour, which, unfortunately, remained a declaration.
> I agree that R could do better here, let's hope in less than ten years
> though. ;-)

and the 2006 thread he mentions is basically a similar question, with a
reply by me agreeing to some extent that a change was desirable ...
Originally we had adhered to the S "standard", which became the S+ one. At
that time I still had access to a running instance of S-PLUS 6.2, where I
had seen that Insightful (the company curating and selling S-PLUS) had also
decided to change the ~15 year old S "standard"... and indeed I was
implicitly *asking* for proposals of such a change, but I think I never saw
a (careful) proposal.

In the spirit of probably 99% of other "base R" code, a change should
really *not* round __at all__ in the summary() methods, but *only* in the
print() methods of such summary() results.
OTOH, for back compatibility, if a user does use summary(.., digits=.)
explicitly, these digits should be 'obeyed' of course.

I think summary(<1-variable>) could easily, and relatively
"back-compatibly", be changed in the above vein.

One "real problem" is the wrong decision (also from S and S-PLUS times
IIRC) to return a "character" matrix for
    summary(<data.frame>, ..)
 or summary(<matrix>, ..)
(For a data frame, I think it should return a list() of single-variable
 summary()es, or else a numeric matrix .. in both cases with a good print()
 method)
because when you return a character matrix, all the numbers are already
rounded, ... and if we follow the above approach they would have to be
rounded further... ``the horror''

I wonder how much code out there is relying on the internal structure of
summary(<data.frame>).. because that is the one part I'd definitely want to
change, too.

Martin
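The character-matrix point can be checked directly. The following is a minimal sketch using a made-up data frame `df`; the lapply() call only illustrates the "list of single-variable summaries" idea from the message and is not an existing option of summary().

```r
# Hypothetical example data, not taken from the thread
df <- data.frame(a = c(155555, 155557), b = c(0.5, 1.5))

# summary() on a data frame returns preformatted strings, not numbers
s_df <- summary(df)
is.character(s_df)    # TRUE
dim(s_df)             # 6 statistics (rows) by 2 variables (columns) here

# Illustration only: one summary per variable, each still numeric
s_list <- lapply(df, summary)
is.numeric(s_list$a)  # TRUE
```

Because the cells of `s_df` are strings, any rounding has already happened by the time a print method sees them, which is the difficulty described above.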
Martin Maechler
2016-Aug-24 09:36 UTC
[Rd] summary.default rounding on numeric seems inconsistent with other R behaviors
>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     on Tue, 23 Aug 2016 14:33:58 +0200 writes:

>>>>> Dirk Eddelbuettel <edd at debian.org>
>>>>>     on Fri, 19 Aug 2016 11:40:05 -0500 writes:

>> It is the old story of defined behaviour and expected outcomes. Hard to
>> change now.

> yes... not impossible though... see below

>> So I would suggest you do something like this in your ~/.Rprofile:
>>
>> R> smry <- function(...) summary(..., digits=6)
>> R> smry(155555L)
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>  155555  155555  155555  155555  155555  155555
>> R>
>>
>> Maybe call it Summary() instead.

> yes, do use a different name. There are other such functions, e.g.
> 'summarize()'.

> Simone wrote

>> I had raised the matter ten years ago, and I was told that the topic was
>> already very^3 old
>>
>> https://stat.ethz.ch/pipermail/r-devel/2006-September/042684.html
>>
>> there is some discussion on its origin and also a declaration of intents to
>> change the default behaviour, which, unfortunately, remained a declaration.
>> I agree that R could do better here, let's hope in less than ten years
>> though. ;-)

> and the 2006 thread he mentions is basically a similar question, with a
> reply by me agreeing to some extent that a change was desirable ...
> Originally we had adhered to the S "standard", which became the S+ one. At
> that time I still had access to a running instance of S-PLUS 6.2, where I
> had seen that Insightful (the company curating and selling S-PLUS) had also
> decided to change the ~15 year old S "standard"... and indeed I was
> implicitly *asking* for proposals of such a change, but I think I never saw
> a (careful) proposal.

> In the spirit of probably 99% of other "base R" code, a change should
> really *not* round __at all__ in the summary() methods, but *only* in the
> print() methods of such summary() results.
> OTOH, for back compatibility, if a user does use summary(.., digits=.)
> explicitly, these digits should be 'obeyed' of course.

> I think summary(<1-variable>) could easily, and relatively
> "back-compatibly", be changed in the above vein.

> One "real problem" is the wrong decision (also from S and S-PLUS times
> IIRC) to return a "character" matrix for
>     summary(<data.frame>, ..)
>  or summary(<matrix>, ..)
> (For a data frame, I think it should return a list() of single-variable
>  summary()es, or else a numeric matrix .. in both cases with a good print()
>  method)
> because when you return a character matrix, all the numbers are already
> rounded, ... and if we follow the above approach they would have to be
> rounded further... ``the horror''

> I wonder how much code out there is relying on the internal structure of
> summary(<data.frame>).. because that is the one part I'd definitely want to
> change, too.

[Talking to myself .. ;-)]
Yes, but that's the tough part to change.

This thread's topic is really only about changing summary.default(), and I
have started testing such a change now, and that does seem very sensible:

 - No rounding in summary.default(), but
 - (almost) back-compatible rounding in its print() method.

My current plan is to commit this to R-devel in a day or so, unless
unforeseen issues emerge.

Martin
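The two-part plan above (no rounding when computing, rounding only when printing, explicit digits still obeyed) can be sketched with a small S3 class. This is not the code that was committed to R-devel; `my_summary()` and the class name "mySummary" are hypothetical names chosen for illustration, and the default number of digits in the print method is an assumption modelled on the older default.

```r
# Hypothetical sketch: compute unrounded statistics, round only in print()
my_summary <- function(x, digits) {
  qq  <- stats::quantile(x, names = FALSE)
  val <- c(qq[1:3], mean(x), qq[4:5])
  names(val) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")
  # Only an explicitly supplied 'digits' causes rounding of the stored values
  if (!missing(digits)) val <- signif(val, digits)
  structure(val, class = "mySummary")
}

# The print method formats for display and leaves the stored values untouched
print.mySummary <- function(x, digits = max(3L, getOption("digits") - 3L), ...) {
  print(format(unclass(x), digits = digits), quote = FALSE)
  invisible(x)
}

s  <- my_summary(c(155555, 155556.25))  # stored values keep full precision
s4 <- my_summary(155555, digits = 4)    # explicit digits are obeyed
print(s)
```

Since rounding lives in the print method, code that works with the returned object sees the full-precision values, while anyone passing digits explicitly still gets the rounded numbers they asked for.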