FWIW, I suspect this is related to the function R_unif_index that was
introduced in src/main/RNG.c around revision 72356, or the way this
function is used in do_sample in src/main/random.c.

20.9.18 08:19, Wolfgang Huber scripsit:
> Besides wording of the documentation re truncating vs rounding, there is
> something peculiar going on with the fractional part of n:
>
> > table(sample.int(2.5, 1e6, replace = TRUE))
>
>      1      2      3
> 399051 401035 199914
>
> > table(sample.int(3, 1e6, replace = TRUE))
>
>      1      2      3
> 332956 332561 334483
>
> > table(sample.int(2.01, 1e6, replace = TRUE))
>
>      1      2      3
> 497173 497866   4961
>
> > sessionInfo()
> R Under development (unstable) (2018-09-17 r75319)
> Platform: x86_64-apple-darwin17.7.0 (64-bit)
> Running under: macOS High Sierra 10.13.6
>
> Matrix products: default
> BLAS: /Users/whuber/R/lib/libRblas.dylib
> LAPACK: /Users/whuber/R/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] fortunes_1.5-4
>
> loaded via a namespace (and not attached):
> [1] compiler_3.6.0 tools_3.6.0
>
>
> 20.9.18 03:00, Dario Strbenac scripsit:
>> Good day,
>>
>> The use of "rounding" also doesn't make sense. If the number is
>> halfway between two integers, it is rounded to the nearest even integer.
>>
>>> round(2.5)
>> [1] 2
>>
>> --------------------------------------
>> Dario Strbenac
>> University of Sydney
>> Camperdown NSW 2050
>> Australia
>

--
With thanks in advance-
Wolfgang

-------
Wolfgang Huber
Principal Investigator, EMBL Senior Scientist
European Molecular Biology Laboratory (EMBL)
Heidelberg, Germany

wolfgang.huber at embl.de
http://www.huber.embl.de

My book with Susan Holmes: http://www.huber.embl.de/msmb

>>>>> Wolfgang Huber
>>>>>     on Thu, 20 Sep 2018 08:47:47 +0200 writes:

    > FWIW, I suspect this is related to the function
    > R_unif_index that was introduced in src/main/RNG.c around
    > revision 72356, or the way this function is used in
    > do_sample in src/main/random.c.

Yes, it is just the use of 'dn' instead of 'n'
- a one letter thinko I'd say.

But *no*, it's much older than revision 72356; e.g., it's already in

  R version 3.0.0 (2013-04-03) -- "Masked Marvel"

but not yet in

  R version 2.15.3 (2013-03-01) -- "Security Blanket"

----

Here, I clearly think we see a regression bug, and hopefully not
one that should trigger often in practice...

and -- without any statistics about the consequences out in
package space --
I do think we should fix this in code and let the documentation
become "great again" ;-)

Martin

    > 20.9.18 08:19, Wolfgang Huber scripsit:
    >> Besides wording of the documentation re truncating vs
    >> rounding, there is something peculiar going on with the
    >> fractional part of n:
    >>
    >> > table(sample.int(2.5, 1e6, replace = TRUE))
    >>
    >>      1      2      3
    >> 399051 401035 199914
    >>
    >> > table(sample.int(3, 1e6, replace = TRUE))
    >>
    >>      1      2      3
    >> 332956 332561 334483
    >>
    >> > table(sample.int(2.01, 1e6, replace = TRUE))
    >>
    >>      1      2      3
    >> 497173 497866   4961
    >>
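
The 'dn' versus 'n' explanation can be checked numerically. What follows is a minimal sketch, assuming the index is drawn as floor(dn * U) + 1 with U uniform on (0, 1). It is an illustration only, not the code in src/main/random.c, and draw_index and index_probs are hypothetical helpers. Under that assumption each of 1..floor(dn) has probability 1/dn and floor(dn) + 1 receives the leftover 1 - floor(dn)/dn, i.e. 0.4, 0.4, 0.2 for dn = 2.5 and roughly 0.4975, 0.4975, 0.005 for dn = 2.01, which is consistent with the tables reported in this thread.

```r
# Hypothetical helper (not part of R): draw `size` indices as floor(dn * U) + 1.
draw_index <- function(dn, size) floor(dn * runif(size)) + 1

# Exact probabilities implied by that formula for a possibly non-integer dn.
index_probs <- function(dn) {
  n <- floor(dn)
  p <- rep(1 / dn, n)
  if (dn > n) p <- c(p, 1 - n / dn)  # leftover mass lands on n + 1
  p
}

set.seed(1)  # arbitrary seed, only for reproducibility
size <- 1e6
for (dn in c(2.5, 2.01)) {
  observed <- as.numeric(table(draw_index(dn, size))) / size
  print(rbind(observed = observed, expected = index_probs(dn)))
}

# Truncating first (using 'n' rather than 'dn') only ever yields 1..n.
fixed <- draw_index(trunc(2.5), size)
print(table(fixed))
```

The sketch calls runif() directly rather than sample.int(), so it does not depend on which R version is running it.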

>>>>> Martin Maechler
>>>>>     on Thu, 20 Sep 2018 09:20:46 +0200 writes:

>>>>> Wolfgang Huber
>>>>>     on Thu, 20 Sep 2018 08:47:47 +0200 writes:

    >> FWIW, I suspect this is related to the function
    >> R_unif_index that was introduced in src/main/RNG.c around
    >> revision 72356, or the way this function is used in
    >> do_sample in src/main/random.c.

    > Yes, it is just the use of 'dn' instead of 'n'
    > - a one letter thinko I'd say.

    > But *no*, it's much older than revision 72356; e.g., it's already in

    >   R version 3.0.0 (2013-04-03) -- "Masked Marvel"

    > but not yet in

    >   R version 2.15.3 (2013-03-01) -- "Security Blanket"

    > ----

    > Here, I clearly think we see a regression bug, and hopefully not
    > one that should trigger often in practice...

    > and -- without any statistics about the consequences out in
    > package space --
    > I do think we should fix this in code and let the documentation
    > become "great again" ;-)

We have agreed that this is simply a regression and should be
fixed without a change to the documentation.

Consequently, ~ 5 minutes ago

$ svn log -v -c75338
------------------------------------------------------------------------
r75338 | maechler | 2018-09-20 17:38:46 +0200 (Thu, 20 Sep 2018) | 1 line
Changed paths:
   M /trunk/doc/NEWS.Rd
   M /trunk/src/main/random.c
   M /trunk/tests/reg-tests-1d.R

revert sample.int(<non-integer>, k, replace=TRUE) to sane pre_R-3.0.0 behaviour
------------------------------------------------------------------------

This is now back to "correct" behaviour in "R-devel (>= 75338)"

(and, as Duncan Murdoch also said by choosing this thread's
Subject, this is really a different issue than the "Bias in R's....")

Martin