Daniel Malter
2011-Apr-05 16:38 UTC
[R] Precision of summary() when summarizing variables in a data frame
Hi,

I ran summary() on a variable with 409908 numeric observations. The variable is part of a data.frame. The problem is that the min and max returned by summary() do not equal the ones returned by min() and max(). Does anybody know why that is?

> min(data$vc)
[1] 15452
> max(data$vc)
[1] 316148
> summary(data$vc)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  15450   21670   40980   55500   63880  316100

> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] sqldf_0.3-5           chron_2.3-39          gsubfn_0.5-5
[4] proto_0.3-8           RSQLite.extfuns_0.0.1 RSQLite_0.9-4
[7] DBI_0.2-5

Thanks much,
Daniel

--
View this message in context: http://r.789695.n4.nabble.com/Precision-of-summary-when-summarizing-variables-in-a-data-frame-tp3428570p3428570.html
Sent from the R help mailing list archive at Nabble.com.
jim holtman
2011-Apr-05 16:48 UTC
[R] Precision of summary() when summarizing variables in a data frame
They are probably the same. It is just that summary() is printing only 4 significant digits. Try:

options(digits = 20)

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
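The rounding described above can be reproduced without the original data. The following is only a sketch: the vector vc is made up (only its minimum and maximum are taken from Daniel's post), and it assumes that summary() in R 2.11.1 rounds its results to max(3, getOption("digits") - 3) significant digits, which is 4 under the default options(digits = 7). signif() is used here to mimic that rounding.

```r
# Hypothetical stand-in for data$vc; only the min and max come from the post.
vc <- c(15452, 21668, 40975, 63884, 316148)

# Assumed default for summary(): max(3, getOption("digits") - 3),
# written out for the default digits = 7, which gives 4.
sig_digits <- max(3, 7 - 3)

# Round the extremes to that many significant digits.
rounded_min <- signif(min(vc), sig_digits)
rounded_max <- signif(max(vc), sig_digits)

# Compare the exact values with the rounded ones.
print(c(exact = min(vc), rounded = rounded_min))
print(c(exact = max(vc), rounded = rounded_max))
```

With these inputs the rounded values are 15450 and 316100, the same figures that summary() reported in the original post, while min() and max() return the exact 15452 and 316148.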
Erik Iverson
2011-Apr-05 18:53 UTC
[R] Precision of summary() when summarizing variables in a data frame
jim holtman wrote:
> They are probably the same. It is just that summary is printing out 4
> significant digits. Try:
>
> options(digits = 20)

FYI, the default summary method also has its own digits argument.
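As a rough illustration of the digits argument, the sketch below passes it directly to summary() instead of changing the global option. The vector vc is again hypothetical (only its minimum and maximum come from the original post), and the value 10 is an arbitrary choice that is large enough for these numbers. How the printed result is laid out may differ between R versions, so no printed output is shown.

```r
# Hypothetical stand-in for data$vc; only the min and max come from the post.
vc <- c(15452, 21668, 40975, 63884, 316148)

# Ask summary() for more significant digits for this call only,
# leaving options(digits = ...) untouched.
s <- summary(vc, digits = 10)
print(s)

# The stored extremes now agree with min() and max().
same_min <- as.numeric(s["Min."]) == min(vc)
same_max <- as.numeric(s["Max."]) == max(vc)
print(c(same_min = same_min, same_max = same_max))
```

Compared with options(digits = 20), this keeps the change local to one call and avoids printing spurious trailing digits elsewhere in the session.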