thr3ads.net - R help - [R] The behavior of match function [Oct 2005]


ronggui

2005-Oct-21 03:19 UTC

[R] The behavior of match function

> x<-1:10
> y<-x+1e-20
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> y
 [1]  1  2  3  4  5  6  7  8  9 10
> identical(x,y)
[1] FALSE
> match(x,y)
 [1]  1  2  3  4  5  6  7  8  9 10

What principle does the function use to determine whether x matches y?

Thank you!
 				


2005-10-21

------
Department of Sociology
Fudan University

My new mail address is ronggui.huang at gmail.com
Blog:http://sociology.yculblog.com

Marc Schwartz

2005-Oct-21 04:42 UTC

On Fri, 2005-10-21 at 11:19 +0800, ronggui wrote:
> > x<-1:10
> > y<-x+1e-20
> > x
>  [1]  1  2  3  4  5  6  7  8  9 10
> > y
>  [1]  1  2  3  4  5  6  7  8  9 10
> > identical(x,y)
> [1] FALSE
> > match(x,y)
>  [1]  1  2  3  4  5  6  7  8  9 10
> 
> What principle does the function use to determine whether x matches y?
> 
> Thank you!

In this case, you are comparing x (an integer) with y (a numeric):
> x <- 1:10
> y <- x + 1e-20
> class(x)
[1] "integer"
> class(y)
[1] "numeric"


Now:
> x == y
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

is TRUE for every element, because the amount added (1e-20) is far
smaller than:
> .Machine$double.eps
[1] 2.220446e-16

which is the smallest positive float such that 1 plus that value != 1.
Adding 1e-20 therefore leaves each stored value unchanged, so the
comparison is between identical numbers. See ?.Machine for more
information on that.
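
As a rough illustration of that point, the following sketch checks what happens when a very small amount is added to a double. The variable names are made up for this example, and the last check is an added observation: the spacing between adjacent doubles grows with magnitude, so an amount that changes 1 need not change 10.

```r
# .Machine$double.eps is the gap between 1 and the next larger double.
eps <- .Machine$double.eps

# An amount far below that gap is lost: the sum rounds back to 1.
small_unchanged <- (1 + 1e-20) == 1

# Adding eps itself yields a different double.
eps_changes <- (1 + eps) != 1

# Near 10 the gap between doubles is larger, so eps is lost there.
eps_lost_at_10 <- (10 + eps) == 10

print(c(small_unchanged, eps_changes, eps_lost_at_10))
```

Note that == itself applies no tolerance; it reports TRUE here only because the stored values are the same.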

For the same reason:
> match(x, y)
 [1]  1  2  3  4  5  6  7  8  9 10
> x %in% y
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

both work element-wise.
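
A small sketch of why match() succeeds while identical() does not, assuming the behaviour described in ?match (x and table are coerced to a common type before matching). The names same_values and same_after_coercion are introduced only for this example.

```r
x <- 1:10          # integer
y <- x + 1e-20     # double; the addition rounds back to 1, 2, ..., 10

# After coercing x to double, the two vectors hold the same values.
same_values <- all(as.numeric(x) == y)
same_after_coercion <- identical(as.numeric(x), y)

# identical(x, y) is FALSE only because the storage types differ.
types <- c(typeof(x), typeof(y))

print(same_values)
print(same_after_coercion)
print(types)
```

So identical() is comparing the objects as a whole, including their type, whereas match() and %in% compare the values after coercion.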


However, if you used the following for 'y':
> y <- x + 1e-15
Note the results now:
> x == y
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

because you now have differences that are large enough, relative to
.Machine$double.eps, to change the stored values.


In general however, when comparing floats, you will want to use
all.equal():
> all.equal(x, y)
[1] TRUE

which compares the values within a specified level of tolerance.
See ?all.equal for more information and importantly note the use of
isTRUE() as well:
> isTRUE(all.equal(x, y))
[1] TRUE

Using isTRUE() in this way will result in a single TRUE or FALSE result
depending upon the comparison. If the differences happen to be outside
the tolerance level, you get something like the following:
> y <- x + 1e-5
> all.equal(x, y)
[1] "Mean relative  difference: 1.818182e-06"

which does not help if all you want is a single boolean result. Thus the
use of isTRUE() helps here:
> isTRUE(all.equal(x, y))
[1] FALSE
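
all.equal() compares two whole vectors, so it does not give the per-element positions that match() returns. If positions within a tolerance are what is wanted, one possible approach is sketched below. match_tol is a hypothetical helper written for this example, not a base R function, and the absolute tolerance of 1e-8 is an arbitrary assumption.

```r
# Hypothetical helper (not part of base R): like match(), but an element
# counts as a match when it lies within `tol` of an entry in `table`.
match_tol <- function(x, table, tol = 1e-8) {
  vapply(x, function(xi) {
    hits <- which(abs(table - xi) <= tol)
    # Return the first matching position, or NA when nothing is close enough.
    if (length(hits) > 0) hits[1] else NA_integer_
  }, integer(1))
}

x <- 1:10
y <- x + 1e-15

exact <- match(x, y)        # exact comparison: no element matches
approx <- match_tol(x, y)   # comparison within tol: positions 1 to 10

print(exact)
print(approx)
```

This loops over x and scans table each time, so it is only meant for short vectors; the right tolerance depends on the scale of the data.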


You should also read R FAQ 7.31 "Why doesn't R think these numbers are
equal?".

HTH,

Marc Schwartz
