Why does R think these numbers ***are*** equal?
In a somewhat bizarre set of circumstances I calculated
x0 <- 0.03580067
x1 <- 0.03474075
y0 <- 0.4918823
y1 <- 0.4474461
dx <- x1 - x0
dy <- y1 - y0
xx <- (x0 + x1)/2
yy <- (y0 + y1)/2
chk <- yy*dx - xx*dy + x0*dy - y0*dx
If you think about it ***very*** carefully ( :-) ) you'll see that
``chk'' ought to be zero.
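The algebra behind that claim can be spelled out in a short derivation. It uses only the definitions given above (dx, dy, xx, yy) and assumes exact real arithmetic rather than floating point:

```latex
\begin{align*}
xx - x_0 &= \tfrac{x_0 + x_1}{2} - x_0 = \tfrac{dx}{2}, \qquad
yy - y_0 = \tfrac{y_0 + y_1}{2} - y_0 = \tfrac{dy}{2}, \\
\mathrm{chk} &= yy\,dx - xx\,dy + x_0\,dy - y_0\,dx \\
             &= (yy - y_0)\,dx - (xx - x_0)\,dy \\
             &= \tfrac{dy}{2}\,dx - \tfrac{dx}{2}\,dy = 0 .
\end{align*}
```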
Blow me down, R gets 0. Exactly. To as many significant digits/decimal places as I can get it to print out.
But .... I wrote a wee function in C to do the *same* calculation and dyn.load()-ed it and called it with .C(). And I got -1.248844e-19.
This is of course zero, to all floating point arithmetic intents and purposes. But if I name the result returned by my call to .C() ``xxx'' and ask
xxx >= 0
I get FALSE whereas ``chk >= 0'' returns TRUE (as does ``chk <= 0'', of course).
(And inside my C function, the comparison ``xxx >= 0'' yields ``false'' as well.)
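The behaviour described above can be reproduced in plain R from the value quoted in the post. The sketch below is only an illustration: the tolerance `tol` and the variable names `strict`, `nonneg_within_tol` and `near_zero` are assumptions, not part of the original calculation.

```r
# Value reported in the post as the result of the .C() call
xxx <- -1.248844e-19

# Strict comparison: a tiny negative number is still negative
strict <- xxx >= 0                       # FALSE

# Tolerance-based comparisons; tol is a hypothetical choice, not from the post
tol <- sqrt(.Machine$double.eps)         # roughly 1.5e-8
nonneg_within_tol <- xxx >= -tol         # TRUE
near_zero <- isTRUE(all.equal(xxx, 0))   # TRUE
```

Comparing against a small tolerance instead of exactly zero is one common way to live with results that differ only in the last few bits.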
I was vaguely thinking that raw R arithmetic would be equivalent to C arithmetic. (Isn't R written in C?)
Can someone explain to me how it is that R (magically) gets it exactly right, whereas a call to .C() gives the sort of ``approximately right'' answer that one might usually expect? I know that R Core is ***good*** but even they can't make C do infinite precision arithmetic. :-)
This is really just idle curiosity --- I realize that this phenomenon is one that I'll simply have to live with. But if I can get some deeper insight as to why it occurs, well, that would be nice.
cheers,
Rolf Turner
Dear R help,
I am fairly new in data management and programming in R, and am trying to write
what is probably a simple loop, but am not having any luck. I have a dataframe
with something like the following (but much bigger):
Dates<-c("12/10/2010","12/10/2010","12/10/2010","13/10/2010",
"13/10/2010", "13/10/2010")
Groups<-c("A","B","B","A","B","C")
data<-data.frame(Dates, Groups)
I would like to create a new column in the dataframe, and give each distinct
date by group a unique identifying number starting with 1, so that the resulting
column would look something like:
ID<-c(1,2,2,3,4,5)
The loop that I have started to write is something like this (but doesn't
work!):
data$ID<-as.number(c())
for(i in unique(data$Dates)){
  for(j in unique(data$Groups)){
    data$ID[i,j]<-i
    i<-i+1
  }
}
Am I on the right track?
Any help on this is much appreciated!
Chandra
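One possible way to get the column described above without an explicit loop is sketched here. It assumes that IDs should be numbered in order of first appearance of each date/group pair, which matches the desired output shown in the question; the helper variable `key` is hypothetical.

```r
Dates  <- c("12/10/2010", "12/10/2010", "12/10/2010",
            "13/10/2010", "13/10/2010", "13/10/2010")
Groups <- c("A", "B", "B", "A", "B", "C")
data   <- data.frame(Dates, Groups)

# Build one key per row from the date/group pair
key <- paste(data$Dates, data$Groups, sep = "_")

# match() gives the position of each key among the unique keys,
# so IDs count up in order of first appearance
data$ID <- match(key, unique(key))
data$ID
```

For the example data this yields 1 2 2 3 4 5. If the separator "_" could occur inside a date or group label, a different separator would be needed to keep the keys unambiguous.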
On Aug 2, 2011, at 08:02, Rolf Turner wrote:

> Why does R think these numbers ***are*** equal?
>
> In a somewhat bizarre set of circumstances I calculated
>
> x0 <- 0.03580067
> x1 <- 0.03474075
> y0 <- 0.4918823
> y1 <- 0.4474461
> dx <- x1 - x0
> dy <- y1 - y0
> xx <- (x0 + x1)/2
> yy <- (y0 + y1)/2
> chk <- yy*dx - xx*dy + x0*dy - y0*dx
>
> If you think about it ***very*** carefully ( :-) ) you'll see that ``chk'' ought to be zero.
>
> Blow me down, R gets 0. Exactly. To as many significant digits/decimal places
> as I can get it to print out.
>
> But .... I wrote a wee function in C to do the *same* calculation and dyn.load()-ed
> it and called it with .C(). And I got -1.248844e-19.
>
> This is of course zero, to all floating point arithmetic intents and purposes. But if
> I name the result returned by my call to .C() ``xxx'' and ask
>
> xxx >= 0
>
> I get FALSE whereas ``chk >= 0'' returns TRUE (as does ``chk <= 0'', of course).
> (And inside my C function, the comparison ``xxx >= 0'' yields ``false'' as well.)
>
> I was vaguely thinking that raw R arithmetic would be equivalent to C arithmetic.
> (Isn't R written in C?)
>
> Can someone explain to me how it is that R (magically) gets it exactly right, whereas
> a call to .C() gives the sort of ``approximately right'' answer that one might usually
> expect? I know that R Core is ***good*** but even they can't make C do infinite
> precision arithmetic. :-)
>
> This is really just idle curiosity --- I realize that this phenomenon is one that I'll simply have
> to live with. But if I can get some deeper insight as to why it occurs, well, that would
> be nice.

I think the long and the short of it is that R lost a couple of bits of precision that C retained. This sort of thing happens if R stores things into 64 bit floating point objects while C keeps them in 80 bit CPU registers.
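The effect of rounding every intermediate result to a 64 bit double can be seen from within R itself. The sketch below does not reproduce the original C function; it only contrasts step-by-step R arithmetic with `sum()`, which may use an extended-precision accumulator on platforms that provide one. The names `stepwise` and `accumulated` are illustrative, and the value of `accumulated` is platform dependent.

```r
# Each result of R's arithmetic operators is stored as a 64 bit double
.Machine$double.digits             # 53 significand bits

# 2^53 + 1 is not representable as a double, so the 1 is lost at the first step
stepwise <- (2^53 + 1) - 2^53

# sum() may keep its running total in extended precision where available
capabilities("long.double")
accumulated <- sum(c(2^53, 1, -2^53))

c(stepwise = stepwise, accumulated = accumulated)
```

Where extended precision is in use, the two results may differ even though they are mathematically the same expression, which is the same kind of discrepancy as the one between `chk` and the value returned through .C().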

In general, floating point calculations do not obey the laws of math, for example the associative law (i.e., (a+b)-c ?= a+(b-c), especially if b and c are large and nearly equal), so any reordering of expressions by the compiler may give a slightly different result.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
"Døden skal tape!" --- Nordahl Grieg
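The associativity point in the reply above can be checked directly in R. The values below are chosen for illustration only and are not taken from the thread; `cc` plays the role of c in the reply and is renamed to avoid masking base R's `c()`.

```r
a  <- 0.1
b  <- 1e16
cc <- 1e16 - 2          # large and nearly equal to b

# Near 1e16 adjacent doubles are 2 apart, so a is lost when added to b first
left  <- (a + b) - cc   # 2
right <- a + (b - cc)   # 2.1

left == right           # FALSE
```

The two groupings are equal in exact arithmetic, yet they differ by 0.1 here because the intermediate sum a + b cannot hold the small term.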