Why does R think these numbers ***are*** equal? In a somewhat bizarre set of circumstances I calculated x0 <- 0.03580067 x1 <- 0.03474075 y0 <- 0.4918823 y1 <- 0.4474461 dx <- x1 - x0 dy <- y1 - y0 xx <- (x0 + x1)/2 yy <- (y0 + y1)/2 chk <- yy*dx - xx*dy + x0*dy - y0*dx If you think about it ***very*** carefully ( :-) ) you'll see that ``chk'' ought to be zero. Blow me down, R gets 0. Exactly. To as many significant digits/decimal places as I can get it to print out. But .... I wrote a wee function in C to do the *same* calculation and dyn.load()-ed it and called it with .C(). And I got -1.248844e-19. This is of course zero, to all floating point arithmetic intents and purposes. But if I name the result returned by my call to .C() ``xxx'' and ask xxx >= 0 I get FALSE whereas ``chk >= 0'' returns TRUE (as does ``chk <= 0'', of course). (And inside my C function, the comparison ``xxx >= 0'' yields ``false'' as well.) I was vaguely thinking that raw R arithmetic would be equivalent to C arithmetic. (Isn't R written in C?) Can someone explain to me how it is that R (magically) gets it exactly right, whereas a call to .C() gives the sort of ``approximately right'' answer that one might usually expect? I know that R Core is ***good*** but even they can't make C do infinite precision arithmetic. :-) This is really just idle curiosity --- I realize that this phenomenon is one that I'll simply have to live with. But if I can get some deeper insight as to why it occurs, well, that would be nice. cheers, Rolf Turner
Dear R help, I am fairly new in data management and programming in R, and am trying to write what is probably a simple loop, but am not having any luck. I have a dataframe with something like the following (but much bigger): Dates<-c("12/10/2010","12/10/2010","12/10/2010","13/10/2010", "13/10/2010", "13/10/2010") Groups<-c("A","B","B","A","B","C") data<-data.frame(Dates, Groups) I would like to create a new column in the dataframe, and give each distinct date by group a unique identifying number starting with 1, so that the resulting column would look something like: ID<-c(1,2,2,3,4,5) The loop that I have started to write is something like this (but doesn't work!): data$ID<-as.number(c()) for(i in unique(data$Dates)){ for(j in unique(data$Groups)){ data$ID[i,j]<-i i<-i+1 } } Am I on the right track? Any help on this is much appreciated! Chandra [[alternative HTML version deleted]]
On Aug 2, 2011, at 08:02 , Rolf Turner wrote:> > > Why does R think these numbers ***are*** equal? > > In a somewhat bizarre set of circumstances I calculated > > x0 <- 0.03580067 > x1 <- 0.03474075 > y0 <- 0.4918823 > y1 <- 0.4474461 > dx <- x1 - x0 > dy <- y1 - y0 > xx <- (x0 + x1)/2 > yy <- (y0 + y1)/2 > chk <- yy*dx - xx*dy + x0*dy - y0*dx > > If you think about it ***very*** carefully ( :-) ) you'll see that ``chk'' ought to be zero. > > Blow me down, R gets 0. Exactly. To as many significant digits/decimal places > as I can get it to print out. > > But .... I wrote a wee function in C to do the *same* calculation and dyn.load()-ed > it and called it with .C(). And I got -1.248844e-19. > > This is of course zero, to all floating point arithmetic intents and purposes. But if > I name the result returned by my call to .C() ``xxx'' and ask > > xxx >= 0 > > I get FALSE whereas ``chk >= 0'' returns TRUE (as does ``chk <= 0'', of course). > (And inside my C function, the comparison ``xxx >= 0'' yields ``false'' as well.) > > I was vaguely thinking that raw R arithmetic would be equivalent to C arithmetic. > (Isn't R written in C?) > > Can someone explain to me how it is that R (magically) gets it exactly right, whereas > a call to .C() gives the sort of ``approximately right'' answer that one might usually > expect? I know that R Core is ***good*** but even they can't make C do infinite > precision arithmetic. :-) > > This is really just idle curiosity --- I realize that this phenomenon is one that I'll simply have > to live with. But if I can get some deeper insight as to why it occurs, well, that would > be nice.I think the long and the short of it is that R lost a couple of bits of precision that C retained. This sort of thing happens if R stores things into 64 bit floating point objects while C keeps them in 80 bit CPU registers. In general, floating point calculations do not obey the laws of math, for example the associative law (i.e., (a+b)-c ?= a+(b-c), especially if b and c are large and nearly equal), so any reordering of expressions by the compiler may give a slightly different result. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com "D?den skal tape!" --- Nordahl Grieg