thr3ads.net - R devel - [Rd] Strange Behavior in RNG [Aug 2024]


Rui Barradas

2024-Aug-17 03:19 UTC

[Rd] Strange Behavior in RNG

At 01:45 on 17/08/2024, Jiefei Wang wrote:
> Hi,
> 
> I just observed a strange behavior in R. The rnorm function does not
> return the requested number of values. I think it is somehow related
> to the internal representation of double-type numbers but I am not
> sure if this is supposed to happen. Below is a reproducible example
> 
> ```
> ## Create a list, we will only take the fourth value, which is 0.6
> nList <- seq(0,1,0.2)
> n <- nList[4]
> n
> # [1] 0.6
> length(rnorm(1000*n))
> # [1] 600
> length(rnorm(1000-1000*n))
> # [1] 399 <--- What happened here?
> length(rnorm(1000-1000*0.6))
> # [1] 400
> 1000-1000*n
> # [1] 400 <- this looks good to me...
> 1000-1000*0.6
> # [1] 400
> identical(n, 0.6)
> # [1] FALSE
> .Internal(inspect(n))
> # @0x00000217c75d79d0 14 REALSXP g0c1 [REF(1)] (len=1, tl=0) 0.6
> .Internal(inspect(0.6))
> # @0x00000217c791e0c8 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 0.6
> ```
> 
> As you can see, length(rnorm(1000-1000*n)) does not really give me the
> result I want. This is somewhat surprising because it is hard to
> imagine that a manually-typed 0.6 can behave differently than 0.6 from
> a sequence. Furthermore, 0.6 is the only problematic number from
> `nList`. The remaining numbers work fine. I guess it is due to the
> rounding mechanism, but I think this should be treated as a bug: if
> the print function can show the result of 1000-1000*n correctly, it
> is strange that rnorm behaves differently. Below is my session
> info
> 
> R version 4.3.0 (2023-04-21 ucrt)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 19045)
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=English_United States.utf8
> [2] LC_CTYPE=English_United States.utf8
> [3] LC_MONETARY=English_United States.utf8
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.utf8
> 
> time zone: America/Chicago
> tzcode source: internal
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

Hello,

This is R FAQ 7.31.
In fact, the sequences

seq(0, 1, 0.1)
seq(0, 1, 0.2)

should probably be a FAQ 7.31 example.
If you print the numbers with more decimals you will see the cause of the error.



# generate the list
nList <- seq(0,1,0.2)
# compare the list with manually typed numbers
nList != c(0, 0.2, 0.4, 0.6, 0.8, 1)
#> [1] FALSE FALSE FALSE  TRUE FALSE FALSE

# note the value of 0.6
print(nList, digits = 16L)
#> [1] 0.0000000000000000 0.2000000000000000 0.4000000000000000 0.6000000000000001
#> [5] 0.8000000000000000 1.0000000000000000
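
The same effect can be followed through to the rnorm call. The following is a minimal sketch, assuming that rnorm coerces a non-integer n by truncating toward zero, which is what as.integer() does and which is consistent with the length of 399 reported in the original post. The variable x is introduced here only for illustration.

```r
n <- seq(0, 1, 0.2)[4]     # slightly above the literal 0.6
x <- 1000 - 1000 * n       # therefore slightly below 400

print(x)                   # the default 7 significant digits hide the difference
print(x, digits = 17)      # more digits reveal it
x < 400                    # TRUE
as.integer(x)              # 399: truncation toward zero, not rounding
round(x)                   # 400
isTRUE(all.equal(n, 0.6))  # TRUE: comparison with a tolerance, as FAQ 7.31 suggests
```

Printing rounds to the displayed number of digits, while coercion to an integer count truncates, so the two can disagree for a value just below a whole number.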


Hope this helps,

Rui Barradas




Jiefei Wang

2024-Aug-17 04:11 UTC


[Rd] Strange Behavior in RNG

Hi Rui and John,

Thanks for your reply. I'm not sure if this is a question for R-help, as I
think the behavior of the RNG is weird, but I will be happy to move this
discussion if the admins think it does not belong here.

I was a C/C++ developer, so I understand that double-type numbers can
sometimes generate surprising results, but what is unexpected here is that
even though the number is extremely close to 400, 'rnorm' still rounds it
down to 399. Shouldn't it be rounded up in this case? Probably the underlying
code just converts the number into an int type, but I was expecting that the
function could tolerate a certain degree of error. Maybe my expectations for
it are too high...

Best,
Jiefei
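
One way to get the tolerance asked about here is to round explicitly before passing a computed count to rnorm. The following is only a sketch under stated assumptions: as_count is a hypothetical helper, not part of R, and the default tolerance of sqrt(.Machine$double.eps) is an arbitrary choice borrowed from the scale that all.equal() uses.

```r
# Hypothetical helper (not part of R): convert a computed value to an
# integer count, accepting only values very close to a whole number.
as_count <- function(x, tol = sqrt(.Machine$double.eps)) {
  r <- round(x)
  if (abs(x - r) > tol) {
    stop("value is not close to a whole number: ", format(x, digits = 17))
  }
  as.integer(r)
}

n <- seq(0, 1, 0.2)[4]
length(rnorm(as_count(1000 - 1000 * n)))  # 400
```

Rejecting values far from a whole number, rather than silently rounding them, keeps genuine mistakes such as a count of 399.5 visible.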



