thr3ads.net - R devel - [Rd] undesirable rounding off due to 'read.table' (PR#8974) [Jun 2006]


overeem at knmi.nl

2006-Jun-13 14:01 UTC

[Rd] undesirable rounding off due to 'read.table' (PR#8974)

Full_Name: Aart Overeem
Version: 2.2.0
OS: Linux
Submission from: (NULL) (145.23.254.155)


Construct a dataframe consisting of several variables by using 'data.frame'
and 'cbind' and write it to a file with 'write.table'. The file consists of
headers and values, such as 12.4283675334551 (so 13 numbers behind the
decimal point). If this dataframe is read with
'read.table(filename, skip = 1)' or 'read.table(filename, header = TRUE)'
the values only have 7 numbers behind the decimal point, e.g. 12.42837. So,
the reading rounds off the values. This is not mentioned in the manual.
Although the values still have many numbers behind the decimal point,
rounding off is, in my view, never desirable.

Hin-Tak Leung

2006-Jun-13 14:22 UTC


[Rd] undesirable rounding off due to 'read.table' (PR#8974)

overeem at knmi.nl wrote:
> Full_Name: Aart Overeem
> Version: 2.2.0
> OS: Linux
> Submission from: (NULL) (145.23.254.155)
>
>
> Construct a dataframe consisting of several variables by using 'data.frame'
> and 'cbind' and write it to a file with 'write.table'. The file consists of
> headers and values, such as 12.4283675334551 (so 13 numbers behind the
> decimal point). If this dataframe is read with
> 'read.table(filename, skip = 1)' or 'read.table(filename, header = TRUE)'
> the values only have 7 numbers behind the decimal point, e.g. 12.42837. So,
> the reading rounds off the values. This is not mentioned in the manual.
> Although the values still have many numbers behind the decimal point,
> rounding off is, in my view, never desirable.

Hmm, this is probably due to conversion by the scanf family of functions
(I don't know the precise location or mechanism of R doing it, this is a 
guess). It is mentioned in my manpage of sscanf:

        f      Matches an optionally signed floating-point number;
                the next pointer must be a pointer to float.
        e      Equivalent to f.
        g      Equivalent to f.
        E      Equivalent to f.
        a      (C99) Equivalent to f.

So printf/fprintf/sprintf and scanf/sscanf/fscanf are not symmetrical,
and you lose precision from 15 (double) to 7 (float). It is a
generic problem with ANSI C's printf/scanf, not specific to R.

Why don't you use save() or save.image() instead for saving and
reloading a data.frame? It is *much faster*, you get a much smaller file,
and it is also more accurate. Just my two cents.

HTL
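
As an illustration of the save()/load() suggestion above, here is a minimal sketch of a binary round trip. The one-column data frame, the temporary file, and the variable names are assumptions made for this example; they do not come from the original report.

```r
# Hypothetical one-column data frame using the value quoted in the report
x <- data.frame(a = 12.4283675334551)
x_orig <- x

# save() writes R's binary serialization, so no decimal text conversion occurs
path <- tempfile(fileext = ".RData")
save(x, file = path)

# Remove the object and restore it from the file under its original name
rm(x)
load(path)
unlink(path)

# Compare the stored doubles bit for bit
exact <- identical(x$a, x_orig$a)
print(exact)
```

Because the numbers never pass through a decimal text form, this route avoids any question of how many digits were written or read.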

Gavin Simpson

2006-Jun-13 14:38 UTC


[Rd] undesirable rounding off due to 'read.table' (PR#8974)

On Tue, 2006-06-13 at 16:01 +0200, overeem at knmi.nl wrote:
> Full_Name: Aart Overeem
> Version: 2.2.0

You are asked not to report bugs on out-dated versions of R...

> OS: Linux
> Submission from: (NULL) (145.23.254.155)
>
>
> Construct a dataframe consisting of several variables by using 'data.frame'
> and 'cbind' and write it to a file with 'write.table'. The file consists of
> headers and values, such as 12.4283675334551 (so 13 numbers behind the
> decimal point). If this dataframe is read with
> 'read.table(filename, skip = 1)' or 'read.table(filename, header = TRUE)'
> the values only have 7 numbers behind the decimal point, e.g. 12.42837. So,
> the reading rounds off the values. This is not mentioned in the manual.
> Although the values still have many numbers behind the decimal point,
> rounding off is, in my view, never desirable.

Works for me in R 2.3.1 (patched)

Are you mistaking the printed representation of your data.frame for the
real thing? E.g.:

# dummy data
dat <- as.data.frame(matrix(rnorm(100)+ 0.000000000000012, ncol = 10))
# not that reading/writing has anything to do with this, but just to
# prove it
write.table(dat, file = "~/tmp/temp.csv", sep = ",")
dat <- read.table("~/tmp/temp.csv", sep = ",", header = TRUE)
dat
options(digits = 14)
dat

or

print(dat, digits = 14)

G

Ps. Wasn't sure about the etiquette of replying to R-bugs in recipients,
so deleted it in case this caused further work for the maintainer(s) of
the bug repository. Sorry if this isn't desirable.
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
*  Note new Address, Telephone & Fax numbers from 6th April 2006  *
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson
ECRC & ENSIS                  [t] +44 (0)20 7679 0522
UCL Department of Geography   [f] +44 (0)20 7679 0565
Pearson Building              [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street                  [w] http://www.ucl.ac.uk/~ucfagls/cv/
London, UK.                   [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Duncan Murdoch

2006-Jun-13 14:38 UTC


[Rd] undesirable rounding off due to 'read.table' (PR#8974)

overeem at knmi.nl wrote:
> Full_Name: Aart Overeem
> Version: 2.2.0
> OS: Linux
> Submission from: (NULL) (145.23.254.155)
>
>
> Construct a dataframe consisting of several variables by using 'data.frame'
> and 'cbind' and write it to a file with 'write.table'. The file consists of
> headers and values, such as 12.4283675334551 (so 13 numbers behind the
> decimal point). If this dataframe is read with
> 'read.table(filename, skip = 1)' or 'read.table(filename, header = TRUE)'
> the values only have 7 numbers behind the decimal point, e.g. 12.42837. So,
> the reading rounds off the values. This is not mentioned in the manual.
> Although the values still have many numbers behind the decimal point,
> rounding off is, in my view, never desirable.

I don't see this.  Try the following script:

 > x <- data.frame(a=12.4283675334551)
 > x
         a
1 12.42837
 > write.table(x,'test')
 > y <- read.table('test')
 > y
         a
1 12.42837
 > y$a-x$a
[1] 0

If y$a had been rounded to 5 decimal places, then we would see a nonzero
difference at the end.  I think you are being confused by the display,
which only shows 5 decimal places, even though more are maintained
internally.  For example,

 > options(digits=20)
 > y
                 a
1 12.4283675334551
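
The distinction between the displayed and the stored value can also be checked without touching any file. The following is a small sketch using format(); the variable names are chosen only for this illustration.

```r
# The value from the report, held as an ordinary double
v <- 12.4283675334551

# Formatting with 7 significant digits (R's default for 'digits')
shown_default <- format(v, digits = 7)

# Formatting with 15 significant digits shows what is actually stored
shown_full <- format(v, digits = 15)

print(shown_default)
print(shown_full)

# Parsing the short string back gives a different number than v,
# which is what genuine rounding of the data would look like
really_rounded <- as.numeric(shown_default)
print(really_rounded == v)
```

Only the formatted strings differ in length; v itself is unchanged by either call.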


Please be careful about what you submit as a bug report.

Duncan Murdoch

ripley at stats.ox.ac.uk

2006-Jun-13 14:51 UTC


[Rd] undesirable rounding off due to 'read.table' (PR#8974)

I believe this to be a false report.  It is *printing* that rounds off the 
numbers, not the reading.

You provide no evidence of your assertions: here is a simple counter-example:
> A <- data.frame(a=12.4283675334551)
> write.table(A, "foo")
> AA <- read.table("foo")
> A
         a
1 12.42837
> AA
         a
1 12.42837
> print(AA, digits=16)
                 a
1 12.4283675334551
> AA-A
  a
1 0

The last just happens to be true in this example, as there would normally 
be a small representation error.
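
To illustrate the representation error mentioned above, here is a hedged sketch using 1/3, a value that has no exact 15-digit decimal form. It assumes the documented behaviour that write.table converts numbers with the equivalent of 15 significant digits; the value and the variable names are chosen for this example only.

```r
# A value that cannot be written exactly in 15 significant decimal digits
x <- data.frame(a = 1/3)

# Round trip through a text file in a temporary location
path <- tempfile()
write.table(x, path)
y <- read.table(path)
unlink(path)

# Any discrepancy should be on the order of the 16th significant digit
err <- abs(y$a - x$a)
print(err)

# A tolerance-based comparison treats the two values as equal
close_enough <- isTRUE(all.equal(y$a, x$a))
print(close_enough)
```

A discrepancy of this size is far smaller than the rounding to five decimal places claimed in the report, and it comes from the decimal text written to the file rather than from read.table.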

On Tue, 13 Jun 2006, overeem at knmi.nl wrote:
> Full_Name: Aart Overeem
> Version: 2.2.0
You are explicitly instructed to upgrade before reporting a bug.
> OS: Linux
> Submission from: (NULL) (145.23.254.155)
>
>
> Construct a dataframe consisting of several variables by using
> 'data.frame' and 'cbind' and write it to a file with 'write.table'. The
> file consists of headers and values, such as 12.4283675334551 (so 13
> numbers behind the decimal point).
That is, 15 significant digits.
> If this dataframe is read with 'read.table(filename, skip = 1)' or
> 'read.table(filename, header = TRUE)' the values only have 7 numbers
> behind the decimal point, e.g. 12.42837.
Your example has *five*, and is printed to 7 significant digits, the 
default for printing.
> So, the reading rounds off the values. This is not mentioned in the 
> manual. Although the values still have many numbers behind the decimal 
> point, rounding off is, in my view, never desirable.
Nor is submitting false reports.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
