Dear Users,

I encountered a problem in data reading while challenging R (and myself) from a validation point of view. I tried to use some of the NIST reference datasets (http://www.itl.nist.gov/div898/strd/index.html), and the result departed a bit from my expectations. This dataset (case SmLs07) is designed to expose cancellation and accumulation errors, which is why the txt file looks unusual:

Treatment   Response
        1   1000000000000.4
        1   1000000000000.3
        1   1000000000000.5
        ......
        2   1000000000000.2
        2   1000000000000.4
        .....
        3   1000000000000.4
        3   1000000000000.6
        3   1000000000000.4
        .........

After read.table() I expected the same values; instead I got this:

   Treatment              Response
1          1 1000000000000.4000244
2          1 1000000000000.3000488
3          1 1000000000000.5000000
.........
22         2 1000000000000.3000488
23         2 1000000000000.1999512
24         2 1000000000000.4000244
.......
58         3 1000000000000.4000244
59         3 1000000000000.5999756
60         3 1000000000000.4000244
61         3 1000000000000.5999756
62         3 1000000000000.4000244
......

Lots of digits out of nowhere. I assume these numbers come from the binary representation of such tricky decimal numbers, but my question is: how can I avoid this feature of the binary representation?

Moreover, I wonder whether this might raise questions in a regulated environment.
Hi Istvan,

That's most unusual and quite unlikely (and much larger than the usual floating-point rounding errors). Please provide a reproducible example. I assume you got the data from here:
http://www.itl.nist.gov/div898/strd/anova/SmLs07.dat

What did you do with it then? How did you delete the header rows? What R code did you use to read it in? What OS and version of R are you working with?

R has been well validated; it's more likely that you did something sub-optimal while importing the data.

Sarah

On Fri, May 4, 2012 at 9:54 AM, Istvan Nemeth <furgeurge at gmail.com> wrote:
> [original message quoted in full; trimmed]

--
Sarah Goslee
http://www.functionaldiversity.org
Hi Istvan,

Your OS and version of R (e.g. sessionInfo()) would also be useful, as would sending your reply to the R-help list and not just to me.

Sarah

---------- Forwarded message ----------
From: Istvan Nemeth <furgeurge at gmail.com>
Date: Fri, May 4, 2012 at 10:20 AM
Subject: Re: [R] read-in, error???
To: Sarah Goslee <sarah.goslee at gmail.com>

Dear Sarah,

I copied and pasted (Ctrl-C & Ctrl-V) the data from the page into a txt file, then deleted the unwanted "Data:" part. The code is:

LibPath <- getwd()
options(digits = 20)
SmLs07 <- read.table(file.path(LibPath, "SmLs07.txt"), header = TRUE,
                     colClasses = "numeric")
SmLs07$TrtF <- factor(SmLs07$Treatment)
lm02 <- lm(Response ~ TrtF, data = SmLs07)
anova(lm02)
summary(lm02)

I hope this helps to reproduce the phenomenon.

Thanks,
István

2012/5/4 Sarah Goslee <sarah.goslee at gmail.com>:
> [earlier message quoted in full; trimmed]

--
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org
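A minimal sketch (plain base R with IEEE-754 doubles; not code from the thread) showing that the extra digits come from the options(digits = 20) display setting in the code above, not from read.table():

x <- 1000000000000.4            # the value as written in the file
print(x, digits = 7)            # 1e+12  (default-style display)
print(x, digits = 20)           # 1000000000000.4000244
# Doubles near 1e12 are spaced 2^-13 ~ 0.000122 apart, so .4 cannot be
# stored exactly; both decimal strings parse to the very same double:
x == 1000000000000.4000244      # TRUE

In other words, read.table() stored the closest representable double all along; only the high-digits display reveals it.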
You didn't mention it, but did you use something like options(digits = 20) before displaying that data? In any case,

> 1000000000000.4000244 == 1000000000000.4
[1] TRUE

because R uses the IEEE-754 double-precision floating-point arithmetic that all modern computers support. That gives you 53 binary digits of precision (52 stored, plus an implicit leading bit), a little under 16 decimal digits, so your difference in the 18th significant digit is ignored.

If you need more than ~16 decimal digits of precision, you could break the numbers into parts (via string manipulation, before converting them to numbers), or use a high-precision package like Rmpfr to manipulate them (it will be slow and has limited functionality).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Istvan Nemeth
> Sent: Friday, May 04, 2012 6:55 AM
> To: r-help at r-project.org
> Subject: [R] read-in, error???
>
> [original message quoted in full; trimmed]
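To make the Rmpfr suggestion concrete, a sketch (the file name "SmLs07.txt" and the choice of 120 bits are illustrative assumptions, not from the thread): read the Response column as character so read.table() never converts it to a double, then parse the strings with Rmpfr:

library(Rmpfr)   # arbitrary-precision floats via the MPFR library
raw <- read.table("SmLs07.txt", header = TRUE,
                  colClasses = c("integer", "character"))
resp <- mpfr(raw$Response, precBits = 120)        # ~36 decimal digits
# Cancellation now happens in high precision:
resp[1] - mpfr("1000000000000", precBits = 120)   # ~0.4, not 0.4000244...

The trade-off mentioned above is real, though: mpfr vectors are far slower than doubles, and lm()/anova() will not accept them, so this helps for checking sums and differences rather than for running the full ANOVA.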