thr3ads.net - R devel - [Rd] Non identical numerical results from R code vs C/C++ code? [Sep 2010]

If this information is useful, please help other people find it:
Share via:

Renaud Gaujoux

2010-Sep-10 10:46 UTC

[Rd] Non identical numerical results from R code vs C/C++ code?

Hi,

suppose you have two versions of the same algorithm: one in pure R, the 
other one in C/C++ called via .Call().
Assuming there is no bug in the implementations (i.e. they both do the 
same thing), is there any well known reason why the C/C++ implementation 
could return numerical results non identical to the one obtained from 
the pure R code? (e.g. could it be rounding errors? please explain.)
Has anybody had a similar experience?

By not identical, I mean very small differences (< 2.4 e-14), but enough 
to have identical() returning FALSE. Maybe I should not bother, but I 
want to be sure where the differences come from, at least by mere curiosity.

Briefly the R code perform multiple matrix product; the C code is an 
optimization of those specific products via custom for loops, where 
entries are not computed in the same order, etc... which improves both 
memory usage and speed. The result is theoretically the same.

Thank you,
Renaud

-- 
Renaud Gaujoux
Computational Biology - University of Cape Town
South Africa


 

###
UNIVERSITY OF CAPE TOWN 

This e-mail is subject to the UCT ICT policies and e-mail disclaimer published
on our website at http://www.uct.ac.za/about/policies/emaildisclaimer/ or
obtainable from +27 21 650 4500. This e-mail is intended only for the person(s)
to whom it is addressed. If the e-mail has reached you in error, please notify
the author. If you are not the intended recipient of the e-mail you may not use,
disclose, copy, redirect or print the content. If this e-mail is not related to
the business of UCT it is sent by the sender in the sender's individual
capacity.

###

Duncan Murdoch

2010-Sep-10 11:00 UTC

head link

[Rd] Non identical numerical results from R code vs C/C++ code?

On 10/09/2010 6:46 AM, Renaud Gaujoux wrote:> Hi,
> 
> suppose you have two versions of the same algorithm: one in pure R, the 
> other one in C/C++ called via .Call().
> Assuming there is no bug in the implementations (i.e. they both do the 
> same thing), is there any well known reason why the C/C++ implementation 
> could return numerical results non identical to the one obtained from 
> the pure R code? (e.g. could it be rounding errors? please explain.)
> Has anybody had a similar experience?
R often uses extended reals (80 bit floating point values on Intel 
chips) for intermediate values.  C compilers may or may not do
that.> 
> By not identical, I mean very small differences (< 2.4 e-14), but enough
> to have identical() returning FALSE. Maybe I should not bother, but I 
> want to be sure where the differences come from, at least by mere
curiosity.
> 
> Briefly the R code perform multiple matrix product; the C code is an 
> optimization of those specific products via custom for loops, where 
> entries are not computed in the same order, etc... which improves both 
> memory usage and speed. The result is theoretically the same.
Changing the order of operations will often affect rounding.  For 
example, suppose epsilon is the smallest number such that 1 + epsilon is 
not equal to 1.  Then 1 + (epsilon/2) + (epsilon/2) will evaluate to 
either 1 or 1 + epsilon, depending on the order of computing the additions.

Duncan Murdoch
> 
> Thank you,
> Renaud
>

Barry Rowlingson

2010-Sep-10 11:24 UTC

head link

[Rd] Non identical numerical results from R code vs C/C++ code?

On Fri, Sep 10, 2010 at 11:46 AM, Renaud Gaujoux
<renaud at mancala.cbio.uct.ac.za> wrote:> Hi,
>
> suppose you have two versions of the same algorithm: one in pure R, the
> other one in C/C++ called via .Call().
> Assuming there is no bug in the implementations (i.e. they both do the same
> thing), is there any well known reason why the C/C++ implementation could
> return numerical results non identical to the one obtained from the pure R
> code? (e.g. could it be rounding errors? please explain.)
> Has anybody had a similar experience?
>
> By not identical, I mean very small differences (< 2.4 e-14), but enough
to
> have identical() returning FALSE. Maybe I should not bother, but I want to
> be sure where the differences come from, at least by mere curiosity.
>
> Briefly the R code perform multiple matrix product; the C code is an
> optimization of those specific products via custom for loops, where entries
> are not computed in the same order, etc... which improves both memory usage
> and speed. The result is theoretically the same.
 I've had almost exactly this situation recently with an algorithm I
first implemented in R and then in C. Guess what the problem was? Yes,
a bug in the C code.

 At first I thought everything was okay because the returned values
were close-ish, and I thought 'oh, rounding error', but I wasn't
happy
about that. So I dug a bit deeper. This is worth doing even if you are
sure there no bugs in the C code (remember: there's always one more
bug).

 First, construct a minimal dataset so you can demonstrate the problem
with a manageable size of matrix. I went down to 7 data points. Then
get the C to print out its inputs. Identical, and as expected? Good.

 Now debug intermediate calculations, printing or saving from C and
checking they are the same as the intermediate calculations from R. If
possible, try returning intermediate calculations in C and checking
identical() with R intermediates.

 Eventually you will find out where the first diverge - and this is
the important bit. It's no use just knowing the final answers come out
different, it's likely your answer has a sensitive dependence on
initial conditions. You need to track down when the first bits are
different.

 Or it could be rounding error...

Barry

Paul Gilbert

2010-Sep-10 14:07 UTC

head link

[Rd] Non identical numerical results from R code vs C/C++ code?

With fortran I have always managed to be able to get identical results
on the same computer with the same OS. You will have trouble if you
switch OS or hardware, or try the same hardware and OS with different
math libraries. All the real calculations need to be double, even
intermediate variables. Also, I've had trouble with arrays not being
initialized to double 0.0. If you initialize to single 0.0 the
straggling bits can cause differences. You also need to be careful about
conversion from integer to real, to do double conversion. I'm not sure
about C, but I would guess there are some of the same problems.

Paul
>-----Original Message-----
>From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-
>project.org] On Behalf Of Renaud Gaujoux
>Sent: September 10, 2010 6:47 AM
>To: r-devel at r-project.org
>Subject: [Rd] Non identical numerical results from R code vs C/C++
code?>
>Hi,
>
>suppose you have two versions of the same algorithm: one in pure R, the
>other one in C/C++ called via .Call().
>Assuming there is no bug in the implementations (i.e. they both do the
>same thing), is there any well known reason why the C/C++
implementation>could return numerical results non identical to the one obtained from
>the pure R code? (e.g. could it be rounding errors? please explain.)
>Has anybody had a similar experience?
>
>By not identical, I mean very small differences (< 2.4 e-14), but
enough>to have identical() returning FALSE. Maybe I should not bother, but I
>want to be sure where the differences come from, at least by mere
>curiosity.
>
>Briefly the R code perform multiple matrix product; the C code is an
>optimization of those specific products via custom for loops, where
>entries are not computed in the same order, etc... which improves both
>memory usage and speed. The result is theoretically the same.
>
>Thank you,
>Renaud
>
>--
>Renaud Gaujoux
>Computational Biology - University of Cape Town
>South Africa
>
>
>
>
>###
>UNIVERSITY OF CAPE TOWN
>
>This e-mail is subject to the UCT ICT policies and e-mail disclaimer
>published on our website at
>http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from
>+27 21 650 4500. This e-mail is intended only for the person(s) to whom
>it is addressed. If the e-mail has reached you in error, please notify
>the author. If you are not the intended recipient of the e-mail you may
>not use, disclose, copy, redirect or print the content. If this e-mail
>is not related to the business of UCT it is sent by the sender in the
>sender's individual capacity.
>
>###
>
>______________________________________________
>R-devel at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-devel===================================================================================
La version fran?aise suit le texte anglais.

------------------------------------------------------------------------------------

This email may contain privileged and/or confidential information, and the Bank
of
Canada does not waive any related rights. Any distribution, use, or copying of
this
email or the information it contains by other than the intended recipient is
unauthorized. If you received this email in error please delete it immediately
from
your system and notify the sender promptly by email that you have done so. 

------------------------------------------------------------------------------------

Le pr?sent courriel peut contenir de l'information privil?gi?e ou
confidentielle.
La Banque du Canada ne renonce pas aux droits qui s'y rapportent. Toute
diffusion,
utilisation ou copie de ce courriel ou des renseignements qu'il contient par
une
personne autre que le ou les destinataires d?sign?s est interdite. Si vous
recevez
ce courriel par erreur, veuillez le supprimer imm?diatement et envoyer sans
d?lai ?
l'exp?diteur un message ?lectronique pour l'aviser que vous avez ?limin?
de votre
ordinateur toute copie du courriel re?u.

Seemingly Similar Threads

Search for more reasonably related threads

R devel - Sep 2010 - Non identical numerical results from R code vs C/C++ code?

[Rd] Non identical numerical results from R code vs C/C++ code?

[Rd] Non identical numerical results from R code vs C/C++ code?

[Rd] Non identical numerical results from R code vs C/C++ code?

[Rd] Non identical numerical results from R code vs C/C++ code?

Seemingly Similar Threads