Paul Lemmens
2003-May-27 12:53 UTC
[R] Numbers that look equal, should be equal, but if() doesn't see as equal
Hi!

After a lot of testing and debugging I'm falling silent in figuring out what goes wrong in the following.

I'm implementing the Vincentizing procedure that Ratcliff (1979) described. It's about calculating RT bins for any distribution of RT data. It boils down to rank ordering your data, replicating each data point as many times as you need bins and then splitting up the resulting distribution in equal bins.

The code that I've written is attached (and not included because it is considerable in length due to many comments). Ratcliff.r contains some basic functions and distribution.bins.r contains the problematic function bins.factor() (problem area marked with 'FAILING TEST'). The final attached file is the mock up distribution I made.

The failing test is the check if the mean of the mean RT's for each bin equals the mean of the original distribution. These should/are mathematically equivalent. Sometimes, however, the test fails. With the attached distribution most notably for 4, 7, 8, 9, and 13 bins. Since the means are mathematically equivalent IMHO it should not be an issue of this particular distribution. As a matter of fact, I also have tested some rnorm() distributions and my function also fails on those (albeit a little less often than with foobar.txt).

Problem description: if one calculates the bins or bin means by hand, the mean of the bin means is visually the same as the overall mean, even with options(digits=20), but *still* the test fails.

IMHO it's not my code and neither the distribution I use to test, but still, can you point out an obvious failure of my programming or is it indeed something of R that I don't yet grasp?
thank you for your help,
Paul

-- 
Paul Lemmens
NICI, University of Nijmegen              ASCII Ribbon Campaign /"\
Montessorilaan 3 (B.01.03)                    Against HTML Mail \ /
NL-6525 HR Nijmegen                                              X
The Netherlands                                                 / \
Phonenumber    +31-24-3612648
Fax            +31-24-3616066

-------------- next part --------------
"RT" "Cond"
"1" 1 "A"
"2" 1 "A"
"3" 1 "A"
"4" 2 "A"
"5" 2 "A"
"6" 3 "A"
"7" 3 "A"
"8" 3 "A"
"9" 3 "A"
"10" 3 "A"
"11" 4 "A"
"12" 4 "A"
"13" 4 "A"
"14" 4 "A"
"15" 5 "A"
"16" 5 "A"
"17" 5 "A"
"18" 5 "A"
"19" 5 "A"
"20" 5 "A"
"21" 5 "A"
"22" 6 "A"
"23" 6 "A"
"24" 6 "A"
"25" 6 "A"
"26" 6 "A"
"27" 6 "A"
"28" 6 "A"
"29" 6 "A"
"30" 6 "A"
"31" 7 "A"
"32" 7 "A"
"33" 7 "A"
"34" 7 "A"
"35" 8 "A"
"36" 8 "A"
"37" 8 "A"
"38" 9 "A"
"39" 9 "A"
"40" 10 "A"
"41" 2 "B"
"42" 2 "B"
"43" 2 "B"
"44" 4 "B"
"45" 4 "B"
"46" 6 "B"
"47" 6 "B"
"48" 6 "B"
"49" 6 "B"
"50" 6 "B"
"51" 8 "B"
"52" 8 "B"
"53" 8 "B"
"54" 8 "B"
"55" 10 "B"
"56" 10 "B"
"57" 10 "B"
"58" 10 "B"
"59" 10 "B"
"60" 10 "B"
"61" 10 "B"
"62" 12 "B"
"63" 12 "B"
"64" 12 "B"
"65" 12 "B"
"66" 12 "B"
"67" 12 "B"
"68" 12 "B"
"69" 12 "B"
"70" 12 "B"
"71" 14 "B"
"72" 14 "B"
"73" 14 "B"
"74" 14 "B"
"75" 16 "B"
"76" 16 "B"
"77" 16 "B"
"78" 18 "B"
"79" 18 "B"
"80" 20 "B"
"81" 3 "C"
"82" 3 "C"
"83" 3 "C"
"84" 6 "C"
"85" 6 "C"
"86" 9 "C"
"87" 9 "C"
"88" 9 "C"
"89" 9 "C"
"90" 9 "C"
"91" 12 "C"
"92" 12 "C"
"93" 12 "C"
"94" 12 "C"
"95" 15 "C"
"96" 15 "C"
"97" 15 "C"
"98" 15 "C"
"99" 15 "C"
"100" 15 "C"
"101" 15 "C"
"102" 18 "C"
"103" 18 "C"
"104" 18 "C"
"105" 18 "C"
"106" 18 "C"
"107" 18 "C"
"108" 18 "C"
"109" 18 "C"
"110" 18 "C"
"111" 21 "C"
"112" 21 "C"
"113" 21 "C"
"114" 21 "C"
"115" 24 "C"
"116" 24 "C"
"117" 24 "C"
"118" 27 "C"
"119" 27 "C"
"120" 30 "C"
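
For readers following along without the attachments, the procedure described in the post (rank-order the data, replicate each point once per bin, split into equal bins) might be sketched in R roughly as below. This is not the poster's code, which did not reach the list; `vincentize()` and its arguments `rt` and `n_bins` are hypothetical names chosen for illustration, and the ten values are simply the first ten RTs of condition "A" from the dataset above.

```r
# Hypothetical sketch of the binning described in the post; not the
# code from Ratcliff.r or distribution.bins.r.
vincentize <- function(rt, n_bins) {
  # Rank-order the data, then repeat each point n_bins times so the
  # expanded vector (length n * n_bins) always splits into equal bins.
  expanded <- rep(sort(rt), each = n_bins)
  # Consecutive runs of length(rt) values form one bin each.
  bin_id <- rep(seq_len(n_bins), each = length(rt))
  # Mean RT within each bin.
  tapply(expanded, bin_id, mean)
}

rt <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)  # first ten RTs of condition "A"
bin_means <- vincentize(rt, n_bins = 4)

# Mathematically zero; in floating point it may be zero or a tiny residue.
mean(bin_means) - mean(rt)
```

Because every bin holds the same number of values, the mean of the bin means is algebraically the overall mean, so any nonzero result on the last line comes from rounding in the intermediate sums rather than from the data.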
Prof Brian Ripley
2003-May-27 13:12 UTC
[R] Numbers that look equal, should be equal, but if() doesn't see as equal
?all.equal may help you.

In the absence of any of your code, there is not much we can do, except to comment that if() (of your subject line) only knows about TRUE and FALSE, so we can only guess at what you used to test equality.

On Tue, 27 May 2003, Paul Lemmens wrote:

> After a lot of testing and debugging I'm falling silent in figuring out
> what goes wrong in the following.
>
> I'm implementing the Vincentizing procedure that Ratcliff (1979) described.
> It's about calculating RT bins for any distribution of RT data. It boils
> down to rank ordering your data, replicating each data point as many times
> as you need bins and then splitting up the resulting distribution in equal
> bins.
>
> The code that I've written is attached (and not included because it is
> considerable in length due to many comments).

No code arrived here. What `attached (and not included' means is unclear to me, but only the dataset arrived.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
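
As an illustration of the point about if(): all.equal() returns either TRUE or a character description of the difference, never FALSE, so it is usually wrapped in isTRUE() before being used as a condition. The values `x` and `y` below are made up for the example and are not taken from the poster's data.

```r
x <- 0.1 + 0.2
y <- 0.3

# Exact comparison: FALSE with IEEE double precision arithmetic.
exact <- (x == y)

# all.equal() compares within a small tolerance; isTRUE() turns a
# character "difference" message into FALSE so that if() can use it.
close_enough <- isTRUE(all.equal(x, y))

if (close_enough) {
  message("x and y are equal within tolerance")
} else {
  message("x and y differ")
}
```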
Thomas Lumley
2003-May-27 14:20 UTC
[R] Numbers that look equal, should be equal, but if() doesn't see as equal
On Tue, 27 May 2003, Paul Lemmens wrote:

>
> Problem description: if one calculates the bins or bin means by hand, the
> mean of the bin means is visually the same as the overall mean, even with
> options(digits=20), but *still* the test fails.
>

It is possible at least on some systems for numbers to print the same but not be ==.

It's very difficult to ensure that two different floating point numbers are ==, so it is almost always better to check that the difference is small, which is what all.equal() does.

To get floating point equality you need not just mathematical equivalence but quite a lot of care in handling rounding -- and it can still be broken quite easily by optimising compilers.

If you look at the R tests directory you will see quite a lot of places where mathematically identical quantities are compared with relatively wide tolerances for exactly this reason. Usually getting within 10^-10 or so is sufficient and easily achievable.

	-thomas
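
A small sketch of the two points above: values that print alike yet are not ==, and a comparison against an explicit tolerance. The numbers are invented for the example, and the 1e-10 threshold only mirrors the "10^-10 or so" figure mentioned in the reply; it is an assumption for illustration, not a recommended constant.

```r
a <- sqrt(2)^2
b <- 2

# With the default number of printed digits, both are formatted alike.
same_print <- (format(a) == format(b))

# The underlying doubles differ in the last bits.
same_value <- (a == b)

# Comparing the absolute difference with a tolerance avoids the problem.
tol <- 1e-10  # assumed tolerance, taken from the figure in the reply
within_tol <- abs(a - b) < tol

c(same_print = same_print, same_value = same_value, within_tol = within_tol)
```

An absolute tolerance like this suits values of moderate size; for very large or very small magnitudes a relative check such as all.equal() is generally the safer choice.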