Dear list,

I have found some unusual behavior and would like to check whether it is a possible bug, and whether updating R would fix it. I am not sure if I should post it to this mailing list, but I don't know where the R bug tracker is. The only mention I found that might relate to this is "If times is a computed quantity it is prudent to add a small fuzz." in the rep() help, but I am not sure whether it is related to this particular problem.

Here it goes:

> rep(TRUE,29)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[28] TRUE TRUE
> rep(TRUE,0.29*100)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[28] TRUE
> length(rep(TRUE,29))
[1] 29
> length(rep(TRUE,0.29*100))
[1] 28

Just to make sure:

> 0.29*100
[1] 29

This behavior seems to be independent of what is being repeated (rep()'s first argument):

> length(rep(1,0.29*100))
[1] 28

Also, it occurs only with 0.29:

> length(rep(1,0.291*100))
[1] 29
> for(a in seq(0,1,0.01)) {print(sum(rep(TRUE,a*100)))} # also shows correct values for values from 0 to 1, except for 0.29

I have confirmed that this behavior happens on more than one machine (though I only have the session info for this one):

> sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252    LC_MONETARY=Portuguese_Brazil.1252
[4] LC_NUMERIC=C                       LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] spatstat_1.31-1 deldir_0.0-21   mgcv_1.7-22

loaded via a namespace (and not attached):
[1] grid_2.15.3     lattice_0.20-13 Matrix_1.0-11   nlme_3.1-108    tools_2.15.3

FYI,

> (0.29*100) < 29
[1] TRUE

See R FAQ 7.31 for why.

/Henrik

On Tue, Apr 9, 2013 at 9:11 AM, Jorge Fernando Saraiva de Menezes <jorgefernandosaraiva at gmail.com> wrote:
>> length(rep(TRUE,29))
> [1] 29
>> length(rep(TRUE,0.29*100))
> [1] 28
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
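
For readers following the FAQ 7.31 pointer, here is a minimal sketch (not part of the original reply) that prints 0.29*100 with more digits than R shows by default and compares it with 29 both exactly and with a tolerance. The variable names are illustrative only.

```r
# Illustrative only: inspect the value that prints as 29.
x <- 0.29 * 100

print(x)                      # default printing shows 7 significant digits
shown <- sprintf("%.17g", x)  # 17 significant digits expose the stored value
print(shown)

# Exact comparison fails; tolerance-based comparison succeeds.
exact_equal <- (x == 29)
near_equal  <- isTRUE(all.equal(x, 29))
print(c(exact = exact_equal, near = near_equal))

# rep() truncates a non-integer 'times' towards zero, hence 28 elements.
n <- length(rep(TRUE, x))
print(n)
```

The tolerance-based test with all.equal() is the kind of comparison the FAQ entry suggests for floating-point values; it does not change what rep() does with the value, which is why the count still has to be rounded explicitly.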

[See at end]

On 09-Apr-2013 16:11:18 Jorge Fernando Saraiva de Menezes wrote:
>> length(rep(TRUE,29))
> [1] 29
>> length(rep(TRUE,0.29*100))
> [1] 28
>
> Just to make sure:
>> 0.29*100
> [1] 29
>
> Also it occurs only with the 0.29.
>> length(rep(1,0.291*100))
> [1] 29

The basic issue is, believe it or not, that despite apparently:

  0.29*100
  # [1] 29

in "reality":

  0.29*100 == 29
  # [1] FALSE

In other words, as computed by R, 0.29*100 is not exactly equal to 29:

  29 - 0.29*100
  # [1] 3.552714e-15

The difference is tiny, but it is sufficient to make 0.29*100 slightly smaller than 29, so rep(TRUE,0.29*100) uses the largest integer not exceeding "times = 0.29*100", i.e. 28.

Hence the recommendation to "add a little fuzz".

On the other hand, when you use rep(1,0.291*100) you will be OK. This is because:

  29 - 0.291*100
  # [1] -0.1

so 0.291*100 is comfortably greater than 29 (but well clear of 30).

The reason for the small inaccuracy (compared with "mathematical truth") is that R performs numerical calculations using finite binary representations of numbers, and there is no exact binary representation of 0.29, so the result of 0.29*100 will be slightly inaccurate.

If you do need to do this sort of thing (e.g. the value of "times" will be the result of a calculation) then one useful precaution could be to round the result:

  round(0.29*100)
  # [1] 29
  29-round(0.29*100)
  # [1] 0
  length(rep(TRUE,0.29*100))
  # [1] 28
  length(rep(TRUE,round(0.29*100)))
  # [1] 29

(The default for round() is 0 decimal places, i.e. it rounds to an integer).

So, compared with:

  0.29*100 == 29
  # [1] FALSE

we have:

  round(0.29*100) == 29
  # [1] TRUE

Hoping this helps,
Ted.

-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 09-Apr-2013  Time: 17:56:33
This message was sent by XFMail
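
As a sketch of the rounding precaution described above, the hypothetical helper `rep_times()` below rounds a computed count to the nearest integer and refuses values that are not close to a whole number. The helper name and the default tolerance are assumptions made for illustration; neither is part of base R.

```r
# Hypothetical helper, not part of base R: turn a computed quantity into a
# safe integer count for rep().
rep_times <- function(x, tol = 1e-8) {
  n <- round(x)
  # Refuse values such as 29.1 that are not within 'tol' of a whole number.
  if (any(abs(x - n) > tol)) {
    stop("'x' is not close to a whole number")
  }
  as.integer(n)
}

len_raw   <- length(rep(TRUE, 0.29 * 100))             # truncated count
len_fixed <- length(rep(TRUE, rep_times(0.29 * 100)))  # rounded count
print(c(raw = len_raw, fixed = len_fixed))

# Which i in 0:100 lose an element when (i/100)*100 is truncated?
i <- 0:100
affected <- i[trunc((i / 100) * 100) != i]
print(affected)
```

The scan at the end is one way to check whether 0.29 is the only affected value on a given machine. Its result is deliberately not stated here, because it can depend on how the fractions are generated (for example i/100 versus seq(0,1,0.01)).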

Possibly R FAQ 7.31.

length(rep(TRUE,signif(0.29*100,2)))
#[1] 29

A.K.

----- Original Message -----
From: Jorge Fernando Saraiva de Menezes <jorgefernandosaraiva at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Tuesday, April 9, 2013 12:11 PM
Subject: [R] rep() fails at times=0.29*100

> length(rep(TRUE,0.29*100))
[1] 28
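
The three workarounds mentioned in this thread (a small fuzz, round(), and signif()) can be compared side by side. This is only a sketch: the fuzz size 1e-8 is an arbitrary assumption, and the 1.23*100 case is not from the thread; it is added to show that signif(x, 2) keeps just two significant digits and therefore changes counts of 100 or more.

```r
x <- 0.29 * 100

# Length of the result under each approach to a computed 'times'.
counts <- c(
  none   = length(rep(TRUE, x)),
  fuzz   = length(rep(TRUE, x + 1e-8)),   # fuzz size chosen arbitrarily
  round  = length(rep(TRUE, round(x))),
  signif = length(rep(TRUE, signif(x, 2)))
)
print(counts)

# With a three-digit count, signif(, 2) no longer returns the intended value.
y <- 1.23 * 100
big <- c(round = round(y), signif = signif(y, 2))
print(big)
```

Under these assumptions round() looks like the more general choice when the intended count is known to be a whole number, while signif() needs its digits argument matched to the size of the count.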