Dear R users,

I obtain the following behavior which I cannot understand. Given a file mydata with the following numbers:

    0.171409662475182
    0.15817339510258108
    0.32230311052283256
    0.14890800794176043
    0.17074784910655194
    0.16611515552614162
    0.41
    0.16611515552614162
    0.41760423560555926
    0.11978821972203839

I read the data and perform some calculations:

    a <- 1-read.table("mydata")$V1
    m <- outer(a, a, "/")
    diag(m) <- NA
    mean.row <- apply(m, 1, mean, na.rm=TRUE)

which yield the same value for indices 6 and 8 of mean.row, as would be expected because values 6 and 8 of the original vector are the same:

    > mean.row[6]==mean.row[8]
    [1] TRUE

However, if I reorder the values as follows:

    a <- 1-read.table("mydata")$V1[c(10,2,8,9,7,3,1,4,5,6)]

and repeat the calculations:

    m <- outer(a, a, "/")
    diag(m) <- NA
    mean.row <- apply(m, 1, mean, na.rm=TRUE)
    mean.row[6]==mean.row[8]

The values for indices 10 and 3 of mean.row, which correspond to 6 and 8 in the previous calculations, are not the same anymore:

    > mean.row[10]==mean.row[3]
    [1] FALSE

I understand that limited precision causes "incorrect" results but I wouldn't expect ordering operations to do the same. I couldn't find any information in the site about this. Maybe it's a bug with my version:

    > R.version
             _
    platform i686-pc-linux-gnu
    arch     i686
    os       linux-gnu
    system   i686, linux-gnu
    status
    major    1
    minor    7.0
    year     2003
    month    04
    day      16
    language R

Thanks and best regards,
Carlos
I just got the same answer both times with R 1.8.1 under Windows 2000 on an IBM Thinkpad T30. However, if abs(diff(mean.row[c(3,10)])) is not much bigger than .Machine$double.eps*sum(abs(mean.row[c(3,10)])), then I would not call that a bug. Rather, it should be considered a warning not to expect exact equality when comparing floating point numbers.

Just now, I checked (pi+x-x) == 4*atan(1) and got TRUE in both R 1.8.1 and S-Plus 2000 with x = 1, but FALSE with x = 100.

hope this helps.
spencer graves

Carlos Soares wrote:
> Dear R users,
>
> I obtain the following behavior which I cannot understand. Given a
> file mydata with the following numbers:
>     0.171409662475182
>     0.15817339510258108
>     0.32230311052283256
>     0.14890800794176043
>     0.17074784910655194
>     0.16611515552614162
>     0.41
>     0.16611515552614162
>     0.41760423560555926
>     0.11978821972203839
>
> I read the data and perform some calculations:
>     a <- 1-read.table("mydata")$V1
>     m <- outer(a, a, "/")
>     diag(m) <- NA
>     mean.row <- apply(m, 1, mean, na.rm=TRUE)
>
> which yield the same value for indices 6 and 8 of mean.row, as would
> be expected because values 6 and 8 of the original vector are the same:
>     > mean.row[6]==mean.row[8]
>     [1] TRUE
>
> However, if I reorder the values as follows:
>     a <- 1-read.table("mydata")$V1[c(10,2,8,9,7,3,1,4,5,6)]
>
> and repeat the calculations:
>     m <- outer(a, a, "/")
>     diag(m) <- NA
>     mean.row <- apply(m, 1, mean, na.rm=TRUE)
>     mean.row[6]==mean.row[8]
>
> The values for indices 10 and 3 of mean.row, which correspond to 6 and
> 8 in the previous calculations, are not the same anymore:
>     > mean.row[10]==mean.row[3]
>     [1] FALSE
>
> I understand that limited precision causes "incorrect" results but I
> wouldn't expect ordering operations to do the same. I couldn't find
> any information in the site about this. Maybe it's a bug with my version:
>     > R.version
>              _
>     platform i686-pc-linux-gnu
>     arch     i686
>     os       linux-gnu
>     system   i686, linux-gnu
>     status
>     major    1
>     minor    7.0
>     year     2003
>     month    04
>     day      16
>     language R
>
> Thanks and best regards,
> Carlos
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
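The criterion in the reply above (a difference not much bigger than machine epsilon times the magnitude of the values) can be sketched roughly as follows. The helper name nearly_equal and the slack factor k are hypothetical and chosen only for illustration; they do not come from the thread or from any package. Base R's all.equal() is shown as the more usual alternative.

```r
# Hypothetical helper (not from the thread): treat x and y as equal when
# their difference is at most k * machine epsilon * (|x| + |y|).
# The default k = 4 is an arbitrary illustrative choice.
nearly_equal <- function(x, y, k = 4) {
  abs(x - y) <= k * .Machine$double.eps * (abs(x) + abs(y))
}

x   <- 100
lhs <- pi + x - x     # rounding in pi + x can drop the low bits of pi
rhs <- 4 * atan(1)

lhs == rhs                    # exact comparison; reported FALSE in the reply above
nearly_equal(lhs, rhs)        # tolerance-based comparison
isTRUE(all.equal(lhs, rhs))   # base R comparison with a default tolerance
```

Applied to the original question, the same check would be nearly_equal(mean.row[10], mean.row[3]) instead of mean.row[10]==mean.row[3].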
On Fri, 26 Dec 2003, Carlos Soares wrote:
> The values for indices 10 and 3 of mean.row, which correspond to 6 and 8
> in the previous calculations, are not the same anymore:
>     > mean.row[10]==mean.row[3]
>     [1] FALSE
>
> I understand that limited precision causes "incorrect" results but I
> wouldn't expect ordering operations to do the same. I couldn't find any
> information in the site about this. Maybe it's a bug with my version:

Almost certainly not.

The first thing to do is to see how big the difference is, rather than comparing for equality:

    mean.row[10]-mean.row[3]

You will find that it's about the same size as .Machine$double.eps, in which case you just have an example of the fact that two floating point numbers computed from different expressions are not reliably equal.

There's nothing puzzling about the fact that you just reordered the numbers; floating point addition is not associative.

-thomas
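A small sketch of the non-associativity mentioned above, using the values 0.1, 0.2 and 0.3, which are picked purely for illustration and are not from the thread. Reduce() is used rather than sum() so that each addition is an ordinary double-precision `+` applied strictly left to right; sum() may accumulate in higher precision and hide the effect.

```r
# Illustrative values only; none of them is exactly representable in binary.
x1 <- 0.1
x2 <- 0.2
x3 <- 0.3

left  <- (x1 + x2) + x3   # add the two small terms first
right <- x1 + (x2 + x3)   # add the two larger terms first

left == right             # exact comparison of the two groupings
left - right              # size of the discrepancy, if any

# The same effect when only the order of the terms changes:
v <- c(x1, x2, x3)
forward  <- Reduce(`+`, v)        # ((0.1 + 0.2) + 0.3)
backward <- Reduce(`+`, rev(v))   # ((0.3 + 0.2) + 0.1)
forward - backward
```

On ordinary IEEE 754 double arithmetic the two groupings are expected to differ by an amount on the order of .Machine$double.eps, which is the same kind of discrepancy that reordering the input vector produces in the row means.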