Hi,

I just observed a strange behavior in R. The rnorm function does not give me a vector of the requested length. I think it is somehow related to the internal representation of double-type numbers, but I am not sure whether this is supposed to happen. Below is a reproducible example:

```
## Create a list, we will only take the fourth value, which is 0.6
nList <- seq(0,1,0.2)
n <- nList[4]
n
# [1] 0.6
length(rnorm(1000*n))
# [1] 600
length(rnorm(1000-1000*n))
# [1] 399 <--- What happened here?
length(rnorm(1000-1000*0.6))
# [1] 400
1000-1000*n
# [1] 400 <- this looks good to me...
1000-1000*0.6
# [1] 400
identical(n, 0.6)
# [1] FALSE
.Internal(inspect(n))
# @0x00000217c75d79d0 14 REALSXP g0c1 [REF(1)] (len=1, tl=0) 0.6
.Internal(inspect(0.6))
# @0x00000217c791e0c8 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 0.6
```

As you can see, length(rnorm(1000-1000*n)) does not give me the result I want. This is somewhat surprising, because it is hard to imagine that a manually typed 0.6 can behave differently from a 0.6 taken from a sequence. Furthermore, 0.6 is the only problematic number in `nList`; the remaining numbers work fine. I can guess it is due to the rounding mechanism, but I think this should be treated as a bug: if the print function shows the result of 1000-1000*n correctly, it is strange that rnorm behaves differently.

Below is my session info:

R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

time zone: America/Chicago
tzcode source: internal

Dear Jiefei Wang,

This question is really more appropriate for the r-help list than for the r-devel list. Nevertheless, see item 7.31 in the R FAQ <https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f>, about floating-point arithmetic.

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2024-08-16 8:45 p.m., Jiefei Wang wrote:
> [...]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
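
As an illustration of the FAQ 7.31 advice, here is a minimal sketch that contrasts an exact comparison with a tolerance-based one, using the value from the original post. The variable names `exact_equal` and `nearly_equal` are made up for this example; `all.equal()` is the base R function for comparing numbers up to a small tolerance.

```r
# Value taken from the sequence, as in the original post
nList <- seq(0, 1, 0.2)
n <- nList[4]

# Exact comparison: the stored double differs slightly from a typed 0.6
exact_equal <- n == 0.6

# Comparison up to a small numerical tolerance treats the two as equal
nearly_equal <- isTRUE(all.equal(n, 0.6))

print(c(exact = exact_equal, nearly = nearly_equal))
```

Wrapping `all.equal()` in `isTRUE()` matters because `all.equal()` returns a character description of the difference, not `FALSE`, when the values differ by more than the tolerance.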

At 01:45 on 17/08/2024, Jiefei Wang wrote:
> [...]

Hello,

This is R FAQ 7.31. In fact, the sequences

seq(0, 1, 0.1)
seq(0, 1, 0.2)

should probably be a FAQ 7.31 example. If you print the numbers with more decimals, you will see why the error occurs.

# generate the list
nList <- seq(0,1,0.2)
# compare the list with manually typed numbers
nList != c(0, 0.2, 0.4, 0.6, 0.8, 1)
#> [1] FALSE FALSE FALSE TRUE FALSE FALSE
# note the value of 0.6
print(nList, digits = 16L)
#> [1] 0.0000000000000000 0.2000000000000000 0.4000000000000000 0.6000000000000001
#> [5] 0.8000000000000000 1.0000000000000000

Hope this helps,

Rui Barradas
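
The same reasoning explains the 399 reported in the original post. The sketch below assumes that rnorm keeps only the integer part of a non-integer `n` (which is consistent with the observed length, though not verified here against the R sources), and shows rounding the size explicitly before the call as one possible workaround. The names `size`, `truncated`, and `rounded` are introduced for illustration only.

```r
nList <- seq(0, 1, 0.2)
n <- nList[4]

# The requested size, computed as in the original post
size <- 1000 - 1000 * n

# Default printing uses 7 significant digits and hides the discrepancy
print(size)
# More digits reveal that the value is slightly below 400
print(size, digits = 17)

# Dropping the fractional part loses one observation
truncated <- as.integer(size)

# Rounding to the nearest whole number first gives the intended size
rounded <- round(size)

print(c(truncated = truncated, rounded = rounded))
print(length(rnorm(rounded)))
```

Rounding is only appropriate when the size is known to be a whole number up to floating-point error; computing the size with integer arithmetic from the start avoids the issue altogether.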