thr3ads.net - R help - [R] Difference between 32-bit and 64-bit version [Jun 2015]

Thierry Onkelinx

2015-Jun-03 15:56 UTC

[R] Difference between 32-bit and 64-bit version

Dear all,

I'm a bit puzzled by the difference in an object when created in R 32-bit
and R 64-bit.

Consider the code below. test.rda is available at
https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing

# Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
library(lme4)
load("test.rda")
coef.32 <- coef(test)
save(coef.32, file = "32bit.rda")

# Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
library(lme4)
load("~/test.rda")
coef.64 <- coef(test)
save(coef.64, file = "64bit.rda")


# Compare the results
# Run in either R 3.2.0 Windows 32-bit or 64-bit, lme4 1.1-8
library(lme4)
load("32bit.rda")
load("64bit.rda")
identical(coef.32, coef.64) # FALSE
identical(coef.32$fRow, coef.64$fRow) # FALSE
identical(coef.32$fLocation, coef.64$fLocation) # TRUE
identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE

The first comparison is FALSE, because the second is FALSE. But why is the
second FALSE and the third and fourth TRUE?
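
A minimal sketch of how one might locate such differences column by column. The data frames x32 and x64 are made-up stand-ins for coef.32$fRow and coef.64$fRow, not the real data:

```r
# Hypothetical stand-ins for coef.32$fRow and coef.64$fRow (made-up values)
x32 <- data.frame(intercept = c(1, 2, 3), slope = c(0.1 + 0.2, 0.5, 0.7))
x64 <- data.frame(intercept = c(1, 2, 3), slope = c(0.3, 0.5, 0.7))

# Which columns are bitwise identical?
col_same <- mapply(identical, x32, x64)

# Largest absolute difference per column
max_diff <- mapply(function(a, b) max(abs(a - b)), x32, x64)

# Comparison with a numerical tolerance instead of bitwise equality
approx_same <- isTRUE(all.equal(x32, x64))

print(col_same)
print(max_diff)
print(approx_same)
```

A column that is not identical() but has a maximum difference near machine precision points to floating-point noise rather than a real change in the coefficients.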

My goal is to calculate a SHA1 hash of coef(test) to track whether the
coefficients of test have changed. I'd like to get the same hash on
32-bit and 64-bit systems. A simple hack would be to calculate the hash on
round(coef(test), 20). Is that a good or bad idea?

identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey


Duncan Murdoch

2015-Jun-03 16:09 UTC

[R] Difference between 32-bit and 64-bit version

Different math libraries round differently, so small differences are
expected.  This is R FAQ 7.31.  In many cases the 32-bit calculations are
more accurate, because they tend to use more 80-bit extended precision
intermediate values, but that is not guaranteed.
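
As an illustration of the FAQ 7.31 point (a generic sketch, not the lme4 computation from this thread), two mathematically equal quantities need not be stored as the same double:

```r
# 0.1 + 0.2 and 0.3 are different doubles
x <- 0.1 + 0.2
y <- 0.3

exact <- identical(x, y)          # bitwise comparison
close <- isTRUE(all.equal(x, y))  # comparison with a tolerance (about 1.5e-8)
gap   <- abs(x - y)               # size of the discrepancy

print(exact)
print(close)
print(gap)
```

identical() is the strict test that a hash effectively performs, while all.equal() allows for differences of this size.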

Rounding before comparing makes sense, but I would use signif() instead
of round(), I would choose a relatively small number of significant
digits, and I would expect to see a few false positives:  if the true
value is 0 but some "random" noise is added, I'd expect values rounded
by signif() to be unequal.
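
A small sketch of these three points. All values are hypothetical, chosen only to mimic platform noise, and the choice of 10 or 6 digits is an assumption:

```r
# Hypothetical coefficient computed on two platforms, differing by noise
a <- 123456.7890123456
b <- a * (1 + 1e-15)

raw_same    <- identical(a, b)
signif_same <- identical(signif(a, 10), signif(b, 10))

# round() works on decimal places, so it can hide a real difference
# between small values that signif() keeps apart
small1 <- 1.234e-8
small2 <- 9.876e-8
round_hides  <- identical(round(small1, 6), round(small2, 6))
signif_keeps <- !identical(signif(small1, 6), signif(small2, 6))

# The false positive: a true 0 versus 0 plus noise stays unequal
zero_clean <- 0
zero_noisy <- 1e-17
zero_same  <- identical(signif(zero_clean, 6), signif(zero_noisy, 6))

print(c(raw_same, signif_same, round_hides, signif_keeps, zero_same))
```

signif() scales with the magnitude of each value, which is why it handles the first two cases and why noise around an exact zero still shows up as a difference.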

Duncan Murdoch

Thierry Onkelinx

2015-Jun-04 07:59 UTC

[R] Difference between 32-bit and 64-bit version

Dear Duncan,

I had been thinking about FAQ 7.31. I tried to create a dummy dataset with
the same structure to replicate the problem without the need to send my
dataset. However, all of them gave identical() results between 32-bit and
64-bit. Note that coef()$fRow is a 1266 x 6 data.frame. Is it correct to
infer that tiny differences between 32-bit and 64-bit are possible but have
a low probability of occurring?

signif() does indeed make more sense than round(). Using 20 digits gives
identical results; using 21 digits gives non-identical results.
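
A minimal sketch of hashing after signif(), assuming the digest package is available. stable_form() is a hypothetical helper, the data frames are made-up stand-ins for coef(test)$fRow on each platform, and 10 significant digits is an arbitrary choice:

```r
# Hypothetical helper: apply signif() to every numeric column before hashing
stable_form <- function(df, digits = 10) {
  num <- vapply(df, is.numeric, logical(1))
  df[num] <- lapply(df[num], signif, digits = digits)
  df
}

# Made-up stand-ins for coef(test)$fRow on the two platforms
fRow.32 <- data.frame(a = c(1.5, 2.25), b = c(0.1 + 0.2, 4))
fRow.64 <- data.frame(a = c(1.5, 2.25), b = c(0.3, 4))

same_raw    <- identical(fRow.32, fRow.64)
same_stable <- identical(stable_form(fRow.32), stable_form(fRow.64))

# Hash only if the digest package is installed (an assumption, not a given)
if (requireNamespace("digest", quietly = TRUE)) {
  hash.32 <- digest::digest(stable_form(fRow.32), algo = "sha1")
  hash.64 <- digest::digest(stable_form(fRow.64), algo = "sha1")
  print(identical(hash.32, hash.64))
}
```

Fewer significant digits make the hash more robust to platform noise, at the cost of missing changes smaller than the chosen precision.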

Best regards,

ir. Thierry Onkelinx
