Hi all,

I am running R version 2.4.0 (2006-10-03) on an i686 pc with Mandrake 10.2 Linux. I was given a binary data file containing single precision numbers that I would like to read into R. In a previous posting, someone suggested reading in such data as double(), which is what I've tried:

> zz <- file(file, "rb")
> h1 <- readBin(con = zz, what = double(), n = 1, size = 4)
> h1
[1] 0.0500000007451

Except that I know that the first value should be exactly 0.05. To get rid of the unwanted (or really unknown) values, I try using signif(), which gives me:

> h1 <- signif(h1, digits = 8)
> h1
[1] 0.050000001

I suppose I could use:

> h1 <- signif(h1, digits = 7)
> h1
[1] 0.05

But this does not seem ideal to me. Apparently I don't understand machine precision very well, because I don't understand where the extra values are coming from. So I also don't know if this use of signif() will be reliable for all possible values. What about a value of 1.2e-8? Will this be read in as:

> signif(1.200000000034e-8, digits = 7)
[1] 1.2e-08

or could this occur?:

> signif(1.2000034e-8, digits = 7)
[1] 1.200003e-08

Thanks for any advice.

Eric Thompson
Graduate Student
Tufts University
Civil & Environmental Engineering
Medford, MA 02144
On 11/9/2006 4:20 PM, Eric Thompson wrote:
> Hi all,
>
> I am running R version 2.4.0 (2006-10-03) on an i686 pc with Mandrake
> 10.2 Linux. I was given a binary data file containing single precision
> numbers that I would like to read into R. In a previous posting,
> someone suggested reading in such data as double(), which is what I've
> tried:
>
>> zz <- file(file, "rb")
>> h1 <- readBin(con = zz, what = double(), n = 1, size = 4)
>> h1
> [1] 0.0500000007451
>
> Except that I know that the first value should be exactly 0.05.

That's impossible. You can't represent 0.05 in either single or double precision floats. What you're seeing is the error in the single precision version of its representation.

Duncan Murdoch

> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
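To make Duncan's point concrete, here is a minimal sketch that pushes 0.05 through a 4-byte float in memory and prints both stored values to more digits than either format really carries. The helper name to_single() is hypothetical (it is not part of R); it only wraps writeBin() and readBin() on a raw vector, which should behave the same as reading the value back from a file.

```r
# Hypothetical helper: round-trip doubles through IEEE 754 single precision
# using an in-memory raw vector instead of a file.
to_single <- function(x) {
  bytes <- writeBin(x, raw(), size = 4)   # store each value in 4 bytes
  readBin(bytes, what = "double", n = length(x), size = 4)
}

x_single <- to_single(0.05)

# Neither stored value is exactly 0.05; the single is simply farther away.
sprintf("%.20f", 0.05)       # the double nearest to 0.05
sprintf("%.20f", x_single)   # the single nearest to 0.05, widened to double

# The gap stays within half a unit in the last place of a single near 0.05.
abs(x_single - 0.05) <= 2^-29

# Values that are exact binary fractions survive unchanged.
to_single(0.5) == 0.5
```

So the "extra" digits are not garbage in the file: they are the exact value of the nearest single precision number, shown with more digits than a single can hold.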
Hi,

what you are observing is the fact that there is always a limit to the precision with which a floating-point value can be stored. The value you are trying to read is stored in 4 bytes (floats). For higher precision, the value could be stored in 8 bytes (doubles). BTW, R works with 8-byte floating-point values.

Example illustrating this (R --vanilla):

# Write data
> x <- 0.05
> writeBin(x, con="float.bin", size=4)
> writeBin(x, con="double.bin", size=8)

# Read data
> yF <- readBin("float.bin", what="double", size=4)
> yD <- readBin("double.bin", what="double", size=8)

# Display data with different precisions
> options(digits=7) # Default in R (unless you change it)
> getOption("digits")
[1] 7
> yF
[1] 0.05
> yD
[1] 0.05
> options(digits=8)
> yF
[1] 0.050000001
> yD
[1] 0.05
> options(digits=12)
> yF
[1] 0.0500000007451
> yD
[1] 0.05

# Difference between 0.05 stored as double and float
> log10(abs(yD-yF))
[1] -9.12780988455

# Eventually you will see the same for doubles too:
> options(digits=22)
> 1e-24
[1] 9.99999999999999924e-25

Hope this helps!

Henrik
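On the question of whether signif() is reliable here: a single carries roughly 7 significant decimal digits, but IEEE 754 single precision only guarantees that 6 significant decimal digits survive the trip from decimal to single and back. Rounding to 7 digits will usually work, yet it is not guaranteed for every value. The sketch below therefore rounds to 6 digits. It assumes the values in the file were originally written with at most 6 significant digits, which is an assumption about the data, not something the thread confirms. The helper to_single() and the test values are hypothetical.

```r
# Hypothetical helper: round-trip doubles through IEEE 754 single precision.
to_single <- function(x) {
  bytes <- writeBin(x, raw(), size = 4)
  readBin(bytes, what = "double", n = length(x), size = 4)
}

# Illustrative values, each with at most 6 significant decimal digits.
vals <- c(0.05, 1.2e-8, 123.456, 9.87654e10)

read_back <- to_single(vals)               # what readBin(..., size = 4) returns
cleaned   <- signif(read_back, digits = 6) # drop the digits a single cannot hold

# Compare the three versions side by side.
print(data.frame(original = vals, as_single = read_back, cleaned = cleaned),
      digits = 15)

# Largest relative difference between the cleaned and the original values.
max(abs(cleaned - vals) / abs(vals))
```

The relative error of a single is bounded by about 6e-8, so a stored 1.2e-8 cannot come back as something like 1.2000034e-8; the discrepancy only ever shows up around the 8th significant digit.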